Google Apps, Talk / XMPP, Android Failures

We recently decided to disable the Google Talk feature for our Google Apps for Business domain and use our own Jabber/XMPP server to improve monitoring and auditability of these communications. Turns out Google does not like it when you disconnect their services, and doing so will fully break email services on your phone.

Disabling Google Talk / XMPP

This process is simple enough.

  1. Create your own XMPP Server (ejabberd works well)
  2. Update the DNS Service Records
  3. Disable the Google Talk service in the control panel for Google Apps
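
Step 2 can be sketched as follows. This is a minimal illustration, not our actual zone data: the host name xmpp.example.com, the TTL, and the priority/weight values are assumptions. The service labels and ports (5222 for client connections, 5269 for server-to-server) are the standard XMPP ones.

```python
def xmpp_srv_records(domain, target, ttl=3600):
    """Build zone-file style SRV lines that point XMPP traffic for `domain` at `target`.

    `target` is the host running your own XMPP server (hypothetical here).
    Priority 0 and weight 5 are arbitrary illustrative values.
    """
    services = (
        ("_xmpp-client", 5222),  # client-to-server connections
        ("_xmpp-server", 5269),  # server-to-server (federation) connections
    )
    return [
        f"{service}._tcp.{domain}. {ttl} IN SRV 0 5 {port} {target}."
        for service, port in services
    ]


if __name__ == "__main__":
    for line in xmpp_srv_records("example.com", "xmpp.example.com"):
        print(line)
```

The _xmpp-server record is the one other XMPP servers look up when deciding where to deliver messages for your domain.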

At this point you’ll notice that your Android phone starts throwing authentication errors for Google Talk, even if the Talk app is not running or has been disabled. This error about Talk authentication is actually coming from the Google Mail service on the phone. It appears every time the phone tries to sync mail or calendar.

We had hoped these issues would be resolved after all systems were updated, but even after 24 hours the errors were still affecting all users.

The actual error is this:

Authentication Error
Google Talk failed to login. If
this is a Google Apps account,
confirm that Chat service is
enabled for this account.
  [ Retry ] [ Cancel ]

If you press Retry it will of course retry and fail. If you press Cancel, your phone will no longer receive mail on that account, leaving you bricked. Don’t worry, it will throw the error again in a few minutes, and still not deliver email to your phone. This error still appears even when Sync is fully disabled for the affected account.

Attempts at Resolution

We attempted multiple ways to try to fix this. The easiest fix was to simply remove that Google Apps account from the phone and re-add it – all was well.

However, that does not work for the primary Google Account on the phone, as that one cannot be removed! Now you have a broken account that cannot be removed and re-created to make it function properly.

So we opened a trouble ticket with Google, via email, and waited 36 hours without a response. As we have Google Apps for Business we also get telephone support for “down” emergencies, such as when our phones cannot receive any email messages.

Three times we called Google (877.355.5787) and were disconnected by their system after entering our support PIN. That sucked. We finally got hold of them to explain the issue, and it went like this:

  • Me: My phone no longer receives messages. I disabled Talk services and now phones cannot receive email.
  • Google: I’ll have to dispatch this to our technical team
  • Me: [ waits 10 minutes ]
  • Google: What network connection are you using wee-fee?
  • Me: You mean why-fi?
  • Google: Here is your ticket number, we’ll call you back within 24 hours. Anything else we can help you with?
  • Me: It would help to get support now, not some mythical time in the future, but we have no choice, I guess I’ll wait with broken email.

They sent an email to follow up with my email down issue, awesome. Of particular note was that our domain name was listed as "Edoceo, Inc.", affected accounts were mis-identified, and the carrier was noted as "Striant" – not "Sprint" like I explained and spelled for the CSR who took the call (their command of the English language was not good).

Hard to have confidence in Google when the service is so fragile and the "support" team is so very sub-par.

How to Fix (what Google should do)

  1. Google Mail should not be attempting to sign-in to Chat when Chat is disabled for this domain. Perhaps it could update settings on the account when service options change.
  2. An option to disable this would be handy. There are no settings for the Chat service in the Google Mail account setup on the phone, only for Contacts, Mail and Calendar.
  3. Not forcing users to tie their phones to an un-removable Google Account would be handy.
  4. Calling the product by a single name would help. Chat and Talk are the same thing, but the two names confused the CSR.

Update from Google #1

Google advised us to just add the account (yes, the one that was already added and that could not be removed because it was the primary). So their first attempt at "support" was #fail. Clearly their support reps did not even read the report we submitted. Also of note here is that the follow-up to our support telephone call was an email from a support rep that contained only canned instructions. Google clearly has a lot to learn about customer service.

Update from Google #2

Supposedly you can simply ‘Resync Account’ by visiting Settings » Accounts & Sync » [ your account ] » Sync Now. We tried that and waited, but the error still appeared.

Update from Edoceo #1

That clue from Google may have started the right process. We noticed that the Google-hosted servers still (even after 24h) think they are responsible for XMPP services for edoceo.com (details below). That is not true; their updates just appear to be taking a while to publish to all their systems, a typical problem for large-scale service providers.

Fixed: Clearly caching is an issue, which led us to the following solution.

  1. Settings » Accounts & Sync », for each affected account
  2. Uncheck All Sync Options (Contacts, Gmail, Calendar)
  3. Wait for two hours (could be less, but that’s how long we waited)
  4. Revisit account options, re-enable Sync for desired services

This will even work to fix issues on the primary account, which cannot be removed. Once it came back on-line the errors from Google Talk Authentication were no longer present.

On another device, where the affected account was not the primary, we were able to remove and re-add the account to fix it. Duh, right?

Here are the details about the issue from our ejabberd verbose logs.

<stream:error>
  <undefined-condition xmlns="urn:ietf:params:xml:ns:xmpp-streams" />
  <str:text xmlns:str="urn:ietf:params:xml:ns:xmpp-streams">edoceo.com is a Google Apps Domain with Talk service enabled.</str:text>
</stream:error>
</stream:stream>

<iq from="$buddy@gmail.com" to="me@edoceo.com/Office" type="error" id="purplef88a9f67">
  <vCard xmlns='vcard-temp'/>
  <error code='404' type='cancel'>
    <remote-server-not-found xmlns='urn:ietf:params:xml:ns:xmpp-stanzas' />
  </error>
</iq>
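
If you want to spot this condition in your own logs, a stream error like the first one above can be picked apart with a few lines of Python. This is only a sketch: the function name parse_stream_error is made up for illustration, and it assumes the fragment has already been cut out of the log. The wrapper element is there because the logged fragment uses the stream: prefix without declaring it.

```python
import xml.etree.ElementTree as ET

STREAMS_NS = "http://etherx.jabber.org/streams"      # namespace of <stream:error>
ERROR_NS = "urn:ietf:params:xml:ns:xmpp-streams"     # namespace of the condition and text


def parse_stream_error(fragment):
    """Return (condition, text) from a <stream:error> fragment, or None if absent."""
    # Wrap the fragment in a root element that declares the "stream" prefix,
    # since a fragment cut from a log does not carry that declaration itself.
    wrapped = f'<root xmlns:stream="{STREAMS_NS}">{fragment}</root>'
    root = ET.fromstring(wrapped)
    error = root.find(f"{{{STREAMS_NS}}}error")
    if error is None:
        return None
    condition, text = None, None
    for child in error:
        if not child.tag.startswith(f"{{{ERROR_NS}}}"):
            continue  # skip application-specific children
        local_name = child.tag.split("}", 1)[1]
        if local_name == "text":
            text = (child.text or "").strip()
        elif condition is None:
            condition = local_name
    return condition, text


SAMPLE = """<stream:error>
  <undefined-condition xmlns="urn:ietf:params:xml:ns:xmpp-streams" />
  <str:text xmlns:str="urn:ietf:params:xml:ns:xmpp-streams">edoceo.com is a Google Apps Domain with Talk service enabled.</str:text>
</stream:error>"""

if __name__ == "__main__":
    print(parse_stream_error(SAMPLE))
```

A text that still says Talk is enabled for your domain means Google's side has not caught up with your change yet.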