How Facebook Moved 20 Billion Instagram Photos Without …
Since 2010, Instagram had run atop Amazon EC2, the seminal cloud computing service that lets anyone build and run software without setting up their own computer servers. To seamlessly move Instagram into an east coast Facebook data center (likely the one in Forest City, North Carolina), Cabrera’s team first created what was essentially a copy of the software underpinning the photo-sharing service. Once this was up and running in the Facebook facility, the team could transfer the data, including those 20 billion photos.
The process was trickier than you might expect. It involved building a single private computer network that spanned the Facebook data center and the Instagram operation on Amazon’s cloud, the best way of securely moving all of the data from one place to another. But the team couldn’t build such a network without first moving Instagram to another part of the Amazon cloud. In other words, Krieger’s crew had to move Instagram once and then move it again. “We had to completely replace the car twice in the last year,” he says.
First, they moved it into Amazon’s Virtual Private Cloud, or VPC, a tool that let Krieger and his crew create a logical network that reached beyond Amazon into the Facebook data center. Creating this network was particularly important because it gave Facebook complete control over the internet addresses used by the machines running Instagram. If they hadn’t moved Instagram onto the VPC, they wouldn’t have been able to define their own addresses on Amazon, he says, which would have meant dealing with myriad address conflicts as they moved software into the data center.
But things were even more complicated than that. The added wrinkle was that, in order to first move Instagram from EC2 to VPC, they also needed to build a common network across those two environments. Amazon doesn’t offer a way of doing that. So, as a temporary fix, Facebook built its own networking tool, something it calls Neti.
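To make the address-conflict problem concrete, here is a minimal Python sketch that checks whether two networks claim overlapping address ranges. The CIDR blocks and the names `uncontrolled` and `planned` are hypothetical and chosen only for illustration; they are not the ranges Instagram or Facebook used.

```python
import ipaddress


def find_conflicts(ranges):
    """Return pairs of named CIDR ranges whose addresses overlap."""
    names = sorted(ranges)
    conflicts = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            net_a = ipaddress.ip_network(ranges[first])
            net_b = ipaddress.ip_network(ranges[second])
            if net_a.overlaps(net_b):
                conflicts.append((first, second))
    return conflicts


# Hypothetical plan where the cloud side uses addresses you did not choose.
uncontrolled = {"cloud": "10.0.0.0/8", "datacenter": "10.20.0.0/16"}

# Hypothetical plan where you pick a cloud range that avoids the data center.
planned = {"cloud": "10.100.0.0/16", "datacenter": "10.20.0.0/16"}

print(find_conflicts(uncontrolled))  # [('cloud', 'datacenter')]
print(find_conflicts(planned))       # []
```

When both sides of a shared private network can be assigned non-overlapping ranges, every machine keeps a unique address as software moves from one side to the other.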
The long and the short of Neti is that it was yet another extensive step in this year-long process, and therein lies the biggest lesson for those who might build atop Amazon and other cloud services. VPC didn’t exist when Instagram was founded in 2010. Today, if other startups build on VPC from the beginning, they can avoid the extra steps that complicated Instagram’s migration. VPC also can help if you want to move just part of your infrastructure from the cloud into a private data center. “If I was starting a new startup or service from scratch today,” Krieger says, “I would totally just start on VPC.”
Once Krieger and his engineers were ready to actually move software and data from place to place, they turned to an increasingly popular tool called Chef. This is a way of writing automated “recipes” for loading and configuring digital stuff on a vast array of machines. They wrote recipes, for instance, that could automatically load the appropriate software onto machines running in the Amazon VPC. Then they used similar recipes to load much the same software on machines inside the Facebook data center. They built recipes for installing software on each flavor of Instagram database server, others for configuring what are called caching servers, which are used to more quickly serve up particularly popular photos, and so on.
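The idea of reusing one recipe per server role across two environments can be sketched as a toy model in Python. This is not Chef's API (Chef recipes are written in a Ruby DSL). The role names, step descriptions, host names, and the `converge` helper are all hypothetical.

```python
# Toy model of role-based "recipes"; this is NOT Chef's actual API.
RECIPES = {
    "database": ["install database package", "write database config"],
    "cache": ["install cache package", "write cache config"],
}


def converge(host, role, state):
    """Apply the role's recipe to a host, skipping steps already applied."""
    applied = state.setdefault(host, [])
    for step in RECIPES[role]:
        if step not in applied:
            applied.append(step)
    return applied


def converge_fleet(fleet, state):
    """Run the matching recipe on every host in a fleet."""
    for host, role in fleet.items():
        converge(host, role, state)
    return state


# Hypothetical hosts in the two environments.
vpc_fleet = {"vpc-db-1": "database", "vpc-cache-1": "cache"}
dc_fleet = {"dc-db-1": "database", "dc-cache-1": "cache"}

state = {}
converge_fleet(vpc_fleet, state)
converge_fleet(dc_fleet, state)
converge_fleet(dc_fleet, state)  # running again changes nothing

print(state["vpc-db-1"] == state["dc-db-1"])  # True
```

Because the same recipe drives both fleets, a database server in the data center ends up configured like its counterpart in the cloud, and rerunning a recipe is harmless.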
How Instagram is scaling its infrastructure across the ocean
In 2014, two years after Instagram joined Facebook, Instagram’s engineering team moved the company’s infrastructure from Amazon Web Services (AWS) servers to Facebook’s data centers. Facebook has multiple data centers across the United States and Europe but, until recently, Instagram used only U.S. data centers.
The main reason Instagram wants to scale its infrastructure to the other side of the ocean is that we have run out of space in the United States. As the service continues to grow, we have reached the point where we need to consider leveraging Facebook’s data centers in Europe. An added bonus: Local data centers will mean lower latency for European users, which will create a better user experience on Instagram.
In 2015, Instagram scaled its infrastructure from one to three data centers to deliver much-needed resiliency. Our engineering team didn’t want to relive the AWS disaster of 2012, when a huge storm in Virginia brought down nearly half of its instances. Scaling from three to five data centers was trivial: we simply increased the replication factor and duplicated data to the new regions. Scaling up is more difficult, however, when the next data center lives on another continent.
Infrastructure can generally be separated into two types:
A stateless service is usually used for computing and scales with user traffic (on an as-needed basis). The Django web server is one example.
A stateful service is usually used for storage and must be consistent across data centers. Examples include Cassandra and TAO.
Everyone loves stateless services. They’re easy to deploy and scale, and you can spin them up whenever and wherever you need them. The truth is we also need stateful services like Cassandra to store user data. Running Cassandra with too many copies not only increases the complexity of maintaining the database; it’s also a waste of capacity. On top of that, having quorum requests travel across the ocean is just… slow.
Instagram also uses TAO, a distributed data store for the social graph, as data storage. We run TAO as a single master per shard, and no slave updates the shard for any write request. It forwards all writes to the shard’s master region. Because all writes happen in the master region (which lives in the United States), the write latency is unbearable in Europe. You may notice that our problem is basically the speed of light.
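As a rough illustration of why the speed of light is the limiting factor, the following Python sketch estimates a lower bound on the round-trip time for a cross-ocean request. The 6,000 km distance and the assumption that signals in fiber travel at about two-thirds of the vacuum speed of light are illustrative assumptions, not figures from Instagram.

```python
C_VACUUM_KM_PER_S = 299_792.458  # speed of light in a vacuum
FIBER_FACTOR = 2 / 3             # assumption: light in fiber is roughly 2/3 as fast


def min_round_trip_ms(distance_km, fiber_factor=FIBER_FACTOR):
    """Lower bound on round-trip time over fiber, ignoring routing and processing."""
    one_way_s = distance_km / (C_VACUUM_KM_PER_S * fiber_factor)
    return 2 * one_way_s * 1000


# Hypothetical 6,000 km path between a European and a U.S. data center.
rtt = min_round_trip_ms(6000)
print(f"{rtt:.1f} ms")  # 60.0 ms
```

No amount of server tuning removes this floor, and a write that needs several sequential round trips pays it several times over.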
Can we reduce the time it takes a request to travel across the ocean (or even make the round trip disappear)? Here are two ways we can solve this problem.
Partition Cassandra
To prevent quorum requests from going across the ocean, we’re thinking about partitioning our dataset into two parts: Cassandra_EU and Cassandra_US. If European users’ data is stored in the Cassandra_EU partition and U.S. users’ data is stored in the Cassandra_US partition, users’ requests won’t need to travel long distances to fetch data.
For example, imagine there are five data centers in the United States and three data centers in the European Union. If we deploy Cassandra in Europe by duplicating the current clusters, the replication factor will be eight and quorum requests must talk to five out of eight replicas.
If, however, we can find a way to partition the data into two sets, we will have a Cassandra_US partition with a replication factor of five and a Cassandra_EU partition with a replication factor of three, and each can operate independently without affecting the other. At the same time, a quorum request for each partition will be able to stay on the same continent, solving the round-trip latency issue.
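The quorum arithmetic in this example can be checked with a short Python sketch. A quorum is a majority of replicas, that is, the replication factor divided by two, rounded down, plus one. The helper names `quorum` and `min_remote_replicas` are introduced here for illustration only.

```python
def quorum(replication_factor):
    """Majority of replicas: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def min_remote_replicas(replication_factor, local_replicas):
    """Fewest replicas a quorum request must reach outside the local continent."""
    return max(0, quorum(replication_factor) - local_replicas)


# One combined cluster: five U.S. replicas plus three EU replicas.
print(quorum(8))                  # 5
print(min_remote_replicas(8, 3))  # 2, so an EU request must cross the ocean

# Partitioned clusters: every replica of a partition is on one continent.
print(quorum(5), min_remote_replicas(5, 5))  # 3 0
print(quorum(3), min_remote_replicas(3, 3))  # 2 0
```

With the combined cluster, a European quorum request can find at most three replicas nearby and must wait on at least two in the United States. After partitioning, both quorums can be satisfied locally.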
Restrict TAO to write to local
To reduce the TAO write latency, we can restrict all EU writes to the local region. It should look almost identical to the end user. When we send a write to TAO, TAO will update locally and, rather than blocking while it sends the write to the master database synchronously, will queue the write in the local region. In the write’s local region, the data will be available immediately from TAO, while in other regions, the data will be available after it propagates from the local region. This is similar to regular writes today, which propagate from the master region.
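The write path described above can be sketched in Python as a local update plus a queue that is drained later. This is a simplified model and not TAO's implementation: the `Region` class, its methods, and the key format are hypothetical, and conflict handling between regions is ignored.

```python
from collections import deque


class Region:
    """Simplified region: a local store plus a queue of writes awaiting propagation."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.outbox = deque()

    def write(self, key, value):
        """Apply the write locally and queue it; never waits on another region."""
        self.data[key] = value
        self.outbox.append((key, value))

    def read(self, key):
        return self.data.get(key)

    def propagate(self, others):
        """Drain queued writes to other regions (a background task in practice)."""
        while self.outbox:
            key, value = self.outbox.popleft()
            for region in others:
                region.data[key] = value


eu = Region("eu")
us = Region("us")

eu.write("user:1:bio", "hello")
local_view = eu.read("user:1:bio")     # visible immediately in the local region
remote_before = us.read("user:1:bio")  # not yet visible elsewhere

eu.propagate([us])
remote_after = us.read("user:1:bio")   # visible once the queue has drained

print(local_view, remote_before, remote_after)  # hello None hello
```

The caller of `write` returns as soon as the local update is done, so the cross-ocean trip happens off the request path.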
Although different services may have different bottlenecks, by focusing on the idea of reducing or removing cross-ocean traffic, we can tackle problems one by one.
As in every infrastructure project, we’ve learned some important lessons along the way. Here are some of the main ones.
Don’t rush into a new project. Before you start to provision servers in a new data center, make sure you understand why you need to deploy your service in a new region, what the dependencies are, and how things will work when the new region comes into play. Also, don’t forget to revisit your disaster recovery plans and make any necessary changes.
Don’t underestimate complexity. Always build into your schedule enough time to make mistakes, find unplanned blockers, and learn new dependencies that you didn’t know about. You may find yourself on a path that would inadvertently restructure how your infrastructure was built.
Know your trade-offs. Things always come with a price. When we partitioned our Cassandra database, we saved lots of storage space by reducing the replication factor. However, to make sure each partition was still ready to face a disaster, we needed more Django capacity in the front to accept traffic from a failing region because now partitions can’t share capacity with each other.
Be patient. While turning up the European data centers, I don’t remember how many times we said, “Oh, crud!” But things always get sorted out eventually. It might take longer than you expect, but have patience and work together as a team. It’s a super-fun journey.
Sherry Xiao will present Cross Atlantic: Scaling Instagram Infrastructure from US to Europe at LISA18, October 29-31 in Nashville, Tenn.
Where are the servers located? – BrowserStack
The devices and machines used for testing websites and mobile apps are hosted in the US, Ireland, and the Netherlands. You are automatically directed to the nearest location based on your IP address. Our application, processing, data collection, and other supporting servers are hosted in the US, Ireland, London, Singapore, Australia, and India. The servers are owned by reputable vendors. Each vendor has been thoroughly vetted, and each implements stringent security policies.
Frequently Asked Questions about Instagram data center
Does Instagram have a data center?
In 2015, Instagram scaled its infrastructure from one to three data centers to deliver much-needed resiliency. Our engineering team didn’t want to relive the AWS disaster of 2012, when a huge storm in Virginia brought down nearly half of its instances. (Oct 22, 2018)
Where are Instagram’s servers located?
Our application, processing, data collection, and other supporting servers are hosted in the US, Ireland, London, Singapore, Australia, India. The servers are owned by reputable vendors.
Does Instagram have its own servers?
Picture-based social media service Instagram has been run entirely on AWS since its inception in 2010. It ran on cloud computing service Amazon EC2, which enabled it to build and run its own software without needing its own servers. (Jul 1, 2014)