Google App Engine at Stack Overflow Dev Days DC

Jonathan Blocksom
October 30, 2009

Hi. On Monday October 26, 2009, I gave a talk at the Stack Overflow Dev Days conference in Washington DC on Google App Engine. I'm currently a Software Engineer at Google, working in the DC office in our new Public Sector Projects team. We're responsible for things like Google Moderator, All for Good, and the Google Polling Place Locator.

We use App Engine a lot in our team so I volunteered to give this talk at the conference. I don't work on App Engine internals and I'm not from developer relations. Keep that in mind as you look through the slides.

We used Google Moderator to let the audience ask questions, which seemed to work out pretty well. I didn't have time to address all the questions during the conference so I've spent some time since then putting up answers. All the questions and answers are below for your convenience -- in some ways they're more useful than the slides.

By the way, if you really want to learn more about App Engine, check out the Google I/O session videos from 2009. The ones from 2008 are also good; look under the "APIs & Tools" track.

PDF copy of slides presented by Jonathan Blocksom
App Engine Dev Days DC 20091026

Questions and Answers

Original questions submitted on Google Moderator

How does it help me copy my dna?

(This was an inside joke related to an earlier talk.) Ummm…. App Engine scales, so you don’t have to wear a pager, so there’s less radiation in your nether region, so your DNA copying mechanism works better?

Does Google App Engine scale well for long-running, asynchronous tasks?

Probably not: all tasks have a time limit of 30 seconds. But if your long-running asynchronous task can be broken up into small subparts, each of which handles one unit of work and then queues up another small task to handle the next part, then it might be appropriate.

Note the task queue currently has a limit of 100K tasks/day and 20 invocations/sec.
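Here is a rough sketch of that chaining pattern in plain Python. It is not App Engine code: a deque stands in for the task queue, and the names CHUNK_SIZE, process_chunk and run_all are made up for illustration. The point is only that each task does a small, bounded amount of work and then enqueues its successor instead of looping on.

```python
from collections import deque

CHUNK_SIZE = 3  # hypothetical: how many items one short task handles


def process_chunk(items, start, queue, results):
    """Handle one small unit of work, then enqueue a task for the next unit."""
    end = min(start + CHUNK_SIZE, len(items))
    for item in items[start:end]:
        results.append(item * item)  # placeholder for the real work
    if end < len(items):
        # Hand the remainder to a fresh task rather than continuing here,
        # so no single task runs long.
        queue.append((process_chunk, (items, end, queue, results)))


def run_all(items):
    """Drain the stand-in queue; returns (results, number of tasks run)."""
    queue = deque()
    results = []
    queue.append((process_chunk, (items, 0, queue, results)))
    tasks_run = 0
    while queue:
        func, args = queue.popleft()
        func(*args)
        tasks_run += 1
    return results, tasks_run
```

With seven items and a chunk size of three, run_all(list(range(7))) finishes in three short tasks. On the real service each of those tasks would also count against the daily and per-second task limits.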

What is MegaStore? How is it different from BigTable?

I posted this question on Stack Overflow to see what the community would come up with: http://stackoverflow.com/questions/1628421/how-is-megastore-different-from-bigtable. Two answers pointed to this article: http://perspectives.mvdirona.com/2008/07/10/GoogleMegastore.aspx.

Alex Martelli, who works at Google and is well known in the Python community, pointed out that most of the differences will not be apparent to an App Engine user as GAE already offers or encapsulates the various new services.

The App Engine blog post that talks about the transition is here: http://googleappengine.blogspot.com/2009/09/migration-to-better-datastore.html, and includes some details on the differences between the two and the motivation for the transition.

Very cool. How were you able to talk about scaled computing for so long without using the term “cloud computing”?

I’m an engineer, not a salesman. Is it true that you can’t have a cloud without a lot of hot air?

Are there any sample/open source Python apps available to use as a starting point?

There’s a whole bunch here: http://code.google.com/p/google-app-engine-samples/. The one named simpleajaxchat might be a good place to get started.

Are you planning to sell App Engine as a product similar to the Search Appliance, so that paranoid people can host it in their own datacenter? Maybe in the form of a rack that we can plug in to get an in-house App Engine install?

No plans for this have been made public as far as I know. You could probably roll your own using AppScale, an open source App Engine clone (http://code.google.com/p/appscale/).

Are there any plans to support PHP or Ruby? If so, do you have a timeline?

None are listed in the App Engine roadmap (http://code.google.com/appengine/docs/roadmap.html). Ruby on Rails with JRuby looks like it’s possible; check out http://jruby-rack.appspot.com/.

How are backups performed? Do I need to worry about this (i.e., Sidekick)?

Good question. No need to worry about us sidekicking your data: all of App Engine’s data is replicated across data centers. Built-in background replication is a feature of BigTable, and GAE’s data cells are replicated to ensure we don’t lose anything. When something fails, there can be a little data loss from the most recent transactions which haven’t been replicated yet; minimizing this is one of the reasons App Engine is switching to Megastore behind the scenes (see http://googleappengine.blogspot.com/2009/09/migration-to-better-datastore.html).

You can also back up your data yourself. The App Engine deploy script includes commands to bulk download existing data from App Engine and bulk upload to it. This means you can make and restore your own private backups. This comes in handy when you’re deploying a new version of an app, especially if you’ve changed your data schema around.

Are there plans to support portability? Suppose I want to switch to a competitor. In other words, what steps are there to avoid vendor lock-in?

The Data Liberation Front (an internal Google effort to make sure our users aren’t locked in) has been keeping up with App Engine, and they’ve made sure the bulk data download and upload is working. Here’s their page on App Engine: http://www.dataliberation.org/google/app.

You may also be interested in AppScale, an App Engine clone mentioned in another question.

How much does it cost?

Basic use is free. This includes:
1.3M requests/day
1GB traffic in and out/day
6.5 CPU hours/day

More than that and you have to pay. Prices currently are:
1 GB traffic in: $0.12
1 GB traffic out: $0.10
1 CPU hour: $0.10
1 email: $0.0001

If you compare with Amazon’s EC2 you’ll see these prices are pretty comparable. A big difference with App Engine is that if you’re not receiving any traffic you don’t have a virtual server sitting around idling and charging you money. With App Engine you only pay when your app runs.
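To make the arithmetic concrete, here is a back-of-the-envelope calculator using the free quotas and prices listed above. Treat it as a sketch under a few assumptions of mine: that you are billed only for usage above the free quota, that the 1 GB/day of free traffic applies separately to inbound and outbound, and that every email is billed (no free email quota is listed above). The function name daily_cost and its parameters are made up for illustration; the billing page is the real authority.

```python
# Free daily quotas and prices as listed in this post (October 2009).
FREE_GB_IN = 1.0       # assumption: the 1 GB/day applies to each direction
FREE_GB_OUT = 1.0
FREE_CPU_HOURS = 6.5

PRICE_PER_GB_IN = 0.12
PRICE_PER_GB_OUT = 0.10
PRICE_PER_CPU_HOUR = 0.10
PRICE_PER_EMAIL = 0.0001


def daily_cost(gb_in, gb_out, cpu_hours, emails=0):
    """Estimate one day's bill in dollars for the given usage.

    Assumes only usage above the free quota is billed, and that all
    emails are billed (this post lists no free email quota).
    """
    billable_in = max(0.0, gb_in - FREE_GB_IN)
    billable_out = max(0.0, gb_out - FREE_GB_OUT)
    billable_cpu = max(0.0, cpu_hours - FREE_CPU_HOURS)
    return (billable_in * PRICE_PER_GB_IN
            + billable_out * PRICE_PER_GB_OUT
            + billable_cpu * PRICE_PER_CPU_HOUR
            + emails * PRICE_PER_EMAIL)
```

For example, a day with 3 GB in, 5 GB out and 10 CPU hours comes to about $0.99 under these assumptions, and a day that stays inside the free quotas and sends no email costs nothing.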

Full details on quotas and pricing at http://code.google.com/appengine/docs/quotas.html and http://code.google.com/appengine/docs/billing.html.

Do you know of any high-traffic sites that use AppEngine?

Best Buy is using or has used Giftag, a service which runs on App Engine. The surprisingly popular Facebook application BuddyPoke runs on App Engine. Panoramio, a geotagged photo site owned by Google, is running on App Engine.

What skill sets / attributes are you looking for in potential employees?

As a whole Google looks for great software engineers. In the DC office we also need ones who work well on a small team, can build large reliable systems, and have a passion for the public sector.

How many nodes do you run memcached over for each user, and how much memory is dedicated to each?

The App Engine memcache service is not memcached: the APIs are the same or very similar, but the underlying implementation is completely different. The amount of space available for your app in memcache varies and depends on a number of factors, including traffic.

Note you can find out how much data you have stored in your memcache using the get_stats() method; see http://code.google.com/appengine/docs/python/memcache/clientclass.html#Client_get_stats.
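As a small illustration, here is one way to boil those stats down to a hit ratio and a byte count. The helper summarize_memcache_stats is my own made-up name, and the numbers in the sample are invented; I'm assuming the stats come back as a dictionary with 'hits', 'misses' and 'bytes' entries, as described on the documentation page linked above. The actual App Engine call appears only in a comment, so the sketch runs without the SDK.

```python
def summarize_memcache_stats(stats):
    """Return (hit_ratio, bytes_stored) from a memcache stats dictionary.

    Returns None if no stats are available. Assumes the dictionary has
    'hits', 'misses' and 'bytes' keys, per the get_stats() documentation.
    """
    if stats is None:
        return None
    lookups = stats['hits'] + stats['misses']
    hit_ratio = stats['hits'] / float(lookups) if lookups else 0.0
    return hit_ratio, stats['bytes']


# In an App Engine request handler the dictionary would come from:
#     from google.appengine.api import memcache
#     stats = memcache.get_stats()
# Here we use invented numbers instead.
sample_stats = {'hits': 3, 'misses': 1, 'bytes': 2048}
summary = summarize_memcache_stats(sample_stats)
```

A low hit ratio together with a small byte count is a hint that your entries are being evicted before you get much use out of them.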

What are the most common uses of App Engine (what types of projects) that you see?

App Engine is used very heavily inside Google, so I personally see a lot of internal apps based on it. I expect it will become quite popular for internal tools at companies using Google Apps for their domains, partly because it makes the authentication piece mostly trivial.

Outside of Google it’s quite popular for running Facebook applications. It seems a good fit for Facebook apps because they have small request sizes and generally don’t involve lots of back end processing. Plus since it’s free to get started you can let your app get some traffic before you incur any costs.

Outside of FB it’s used for, well, web apps, which encompasses pretty much anything! Check out the App Gallery for some examples: http://appgallery.appspot.com.

Any thoughts on AppScale, the open-source hosting platform for appengine apps?

Yes, if they would call their product an “Open Source Platform” instead of an “Open Cloud Platform” then they’d be easier to find when you search on “Open source app engine”. Besides, WTF is an Open Cloud? Have you ever seen a closed cloud?

More seriously, thanks for bringing them up. I knew there was a reasonable App Engine clone out there, but all I could find in my searches was AppDrop.com, which doesn’t seem to be around anymore.

I think it’s great that AppScale is out there, and it looks like others in Google agree because we’ve apparently given them some grants. In addition to not being locked in, it could probably be a super handy debugging tool, especially if you run into a problem that’s actually on App Engine’s side of things.

I’d also love to see if they could push the envelope a bit—for example, if they added full text search, AppScale might actually be a compelling replacement for App Engine and not just a clone.

Here’s some info about AppScale: http://appscale.cs.ucsb.edu/
And here’s their page on Google Code: http://code.google.com/p/appscale/

Would you hire former Microsoft developers?

Sure, in fact my manager used to work for Microsoft. You can even run Windows if you want.

Can your local dev sandbox VM be run in any other context than localhost? In other words, can we have the option of developing for App Engine, and hosting ourselves, as with VMware or VirtualBox?

You can specify a different host name and port when launching the local app server from the command line. However, to my knowledge the development app server is not really production grade, so if your goal is to run an app in production you might look into AppScale, an open source App Engine clone that can run on EC2 or Xen.

Why haven’t I heard of this before???

Because you have a sack on your head.