Example Setup:
-local DNS servers = Server 2003
-local domain = sub2.sub1.public.com (and only this zone)
-public DNS (hosted at dnsmadeeasy.com) = A record *.public.com (wildcard; resolves to 1.2.3.4, the public website homepage; there are other A and CNAME records which all work as required, e.g. server1.public.com resolves correctly)
-target address = clientid.sharepoint.com (for this example say it resolves to 5.6.7.8)
Issue:
-user wants to browse to "clientid.sharepoint.com", but gets nothing at all (times out)
Some details:
-from the desktop nslookup for "clientid.sharepoint.com" returns something like "clientid.sharepoint.com.sub1.public.com = 1.2.3.4" (not what we expected)
-from the desktop nslookup for "clientid.sharepoint.com." (trailing dot, so no additional parent/sub domain suffixes are appended) returns something like "clientid.sharepoint.com = 5.6.7.8" (which is what we want, but browsers do not like the trailing dot)
-desired result would be for "clientid.sharepoint.com" to return 5.6.7.8
Notes:
-I do not think sub1.public.com is a real name space (does not actually exist). It seems to return the same 1.2.3.4 ip when queried though, which may be expected (see below).
-what I think is happening is that the workstation sends clientid.sharepoint.com to the local DNS server and the server does something like this:
- do I know host "clientid.sharepoint.com" = no, try appending the current domain suffix
- do I know host "clientid.sharepoint.com.sub2.sub1.public.com" = no, try parent domain
- do I know host "clientid.sharepoint.com.sub1.public.com" = no [does not exist], but there is a wildcard for public.com which is 1.2.3.4
- return "clientid.sharepoint.com.sub1.public.com" = 1.2.3.4
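To make that guess concrete, here is a toy Python model of the steps above. It is only a sketch of what I suspect is going on, not confirmed Windows DNS behavior: the function names, the empty record table, and the rule that walking up stops at public.com are all my own assumptions for illustration.

```python
# Toy model of the lookup steps I am guessing at; not confirmed Windows DNS behavior.
LOCAL_ZONE = "sub2.sub1.public.com"   # the only zone hosted locally
LOCAL_RECORDS = {}                    # hypothetical: no matching host records
WILDCARD_PARENT = "public.com"        # public zone carrying the *.public.com wildcard
WILDCARD_IP = "1.2.3.4"


def candidate_names(name, suffix):
    """Yield the bare name, then name + suffix, then name + each parent of the suffix."""
    yield name
    labels = suffix.split(".")
    # Assumption: walking up stops at the two-label parent (public.com).
    for i in range(len(labels) - 1):
        yield name + "." + ".".join(labels[i:])


def lookup(fqdn):
    """Return an IP if this toy model 'knows' the name, else None."""
    if fqdn.endswith("." + LOCAL_ZONE):
        return LOCAL_RECORDS.get(fqdn)   # local zone: exact records only
    if fqdn.endswith("." + WILDCARD_PARENT):
        return WILDCARD_IP               # anything else under public.com hits the wildcard
    return None                          # assumed "not known" at this step


def resolve(name, suffix=LOCAL_ZONE):
    """Return the first (fqdn, ip) pair that answers, or None."""
    for fqdn in candidate_names(name, suffix):
        ip = lookup(fqdn)
        if ip is not None:
            return fqdn, ip
    return None


print(resolve("clientid.sharepoint.com"))
```

With the empty record table this returns the sub1.public.com name with the wildcard address, which matches what nslookup shows. Adding an entry for "clientid.sharepoint.com.sub2.sub1.public.com" to LOCAL_RECORDS makes the second step answer instead of the wildcard.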
I have been able to work around this for a specific record by creating an A record for "clientid.sharepoint.com" in the local domain with the current IP so the (incorrect) FQDN returns 5.6.7.8 [i.e. clientid.sharepoint.com.sub2.sub1.public.com = 5.6.7.8]. This will of course fail if Microsoft moves the SharePoint account to another server in the future.
I have no history on why the wildcard exists but suspect it is intentional to capture mistyped subdomains/hostnames. Personally I'd like to see it gone. I think the intent is for <doesnotexist>.public.com to redirect to public.com.
This is the first time I've seen this behavior from DNS. That said I have not had anyone with their internal domain name matching part of their public domain name before.
My fear is that this may also be affecting resolution of other domain names, and that I will get more calls as caches expire or as other problems surface over time.
Questions:
-is this expected behavior for Windows DNS with overlapping domain names ("split DNS" comes to mind)?
-can I leave the public wildcard and make the local DNS resolve "clientid.sharepoint.com" by querying the ".com" namespace rather than the public.com namespace?
I imagine my telling them to remove the wildcard will not be received well. My hope is to adjust the DNS so that if the hostname and first FQDN attempts fail, it queries the DNS forwarders without attempting to "follow the dots" upstream.
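To spell out the lookup order I am hoping for, here is a small Python sketch. The function name and the use of a trailing dot to mark a fully qualified name are my own conventions for illustration; this is not a Windows setting or API, just the order of names I would like tried.

```python
def desired_candidates(name, primary_suffix):
    """Names I would like tried, in order (hypothetical helper, illustration only)."""
    if name.endswith("."):
        # Trailing dot: already fully qualified, so try only that name.
        return [name]
    return [
        f"{name}.{primary_suffix}.",  # first FQDN attempt: local suffix only
        f"{name}.",                   # then the name as typed, sent to the forwarders
    ]


for fqdn in desired_candidates("clientid.sharepoint.com", "sub2.sub1.public.com"):
    print(fqdn)
```

The point is that sub1.public.com and public.com never appear in the list, so the wildcard is never consulted for a name that belongs to someone else's domain.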