Where the petabytes of Facebook data are stored. Leaky Facebook: What Social Network Partners Do with User Data Facebook Audience

Reading time: 7 minutes

Collecting a contact database through Facebook is sometimes more profitable and more convenient than doing it through a website. As an additional source of leads, social networks are an ideal platform. Facebook has a tool that collects customer contacts through ads. The advantage of this method is that the client does not have to fill in a long list of fields: the system automatically pulls the email, phone number, name and other information from the Facebook profile. That is, the client sees the advertisement, clicks it, and an already completed form appears. All that remains is to click the submit button.

Collecting a contact base through social networks

Nobody chooses between channels of communication with the client anymore. Instagram, Vkontakte, Facebook, Telegram, YouTube, chat bots and others: brands communicate with the client on all possible platforms. Marketers use every opportunity for an additional “touch”. Therefore, it is important that advertising, mailings, and social networks work together and help rather than interfere with each other.

For example, it is already difficult to find an email that does not include social media buttons with a call to subscribe.


Aviasales

Conversely, social media is a great way to draw readers into your mailing list. Why? Posts in chats or on pages cannot be long, it is hard to fit instructions with screenshots into them, and video is not always convenient to watch with sound. Every channel has its restrictions.

A letter can be put aside and read at a more convenient moment. You can add any information to it with merge tags, which pull information about a client from your database: for example, how many days have passed since the last purchase, or how many new words the client has learned on your platform.
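As a rough illustration of how merge tags work, the sketch below fills a letter template from a client record. The field names (`first_name`, `days_since_purchase`, `words_learned`) and the `$tag` syntax are assumptions made for this example; every mailing service has its own tag format.

```python
from string import Template

# Hypothetical client record pulled from the mailing database.
client = {"first_name": "Anna", "days_since_purchase": 12, "words_learned": 48}

# Merge tags are placeholders that are replaced separately for each recipient.
letter = Template(
    "Hi $first_name! It has been $days_since_purchase days since your last "
    "purchase, and you have learned $words_learned new words."
)


def render(template: Template, record: dict) -> str:
    # safe_substitute leaves unknown tags in place instead of raising an error.
    return template.safe_substitute(record)


message = render(letter, client)
print(message)
```

Leaving unknown tags untouched is a deliberate choice here: a missing field then shows up in a test send instead of breaking the whole mailing.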

Facebook audience

Setting up ads on the principle of “let it land somewhere” is a bad idea. Consider whether you know your target audience or want broad targeting. If you choose the latter, Facebook's algorithms will search for “your” customers themselves.

If you want to set the targeting yourself, then you have to dig deeper. Custom Audiences on Facebook are divided into:

Or create lookalike (similar) audiences: the algorithm will search for people whose characteristics match those of your customers or leads. This can be geo, age, interests, or any other parameters that you specify.

The targeting options are multifaceted and you can work with each audience separately. Create personalized ads and make personalized offers to each potential customer.

How to set up a Facebook lead collection form?

In the first step of setting up Facebook ads, click on "Lead Generation". Enter the name of the campaign. At the next step, accept the terms of use for lead generation ads (if you cannot find them, follow the link).

Next, choose the audience the ad will be shown to. As for placements, we do not recommend advertising on Instagram, because the collection form is displayed poorly there and only annoys users.

Now we create a form for collecting data. It is created and edited in step 3 of creating an advertisement. Click on the "Lead Generation Form".

What data can be pulled from there:

  • contact information (phone, email, name, city, etc.)
  • demographic data (date of birth, gender, marital status, etc.)
  • information about the job (position, company)
  • as well as your own questions (short-answer or multiple-choice questions, a date of visit, which is useful, for example, for HoReCa businesses).

But we are a mailing service

Therefore, the leads need to land in a mailing list. For the integration with Facebook we use the Zapier service. Mailigen is fully integrated with this tool.

The average user often stops reading at this point, because words like integration, API, keys and webhooks only scare people off. There is a way out, and that is why this article was written :)

How to integrate Facebook and Mailigen

For this we need:

Go!

First, on the main page, you need to select Facebook Lead Ads and Mailigen as services with which we will work:

Below, the only zap that combines them is shown:


Click on this zap and go straight to filling it out.

Since we have already chosen the services we will work with, the first two steps have been filled in for us. Just click "Continue" and go to the third step: adding a Facebook account.


Click the "Connect an Account" button and grant the required permissions. Important: your profile must have administrator rights to the company's Facebook page and to Facebook Lead Ads.

The account has been added to the list. Select it and click "Continue".

The fourth step allows you to select a company page and a Facebook Lead Ads subscription form that will transfer data to Mailigen:


If you can't see the page or form you want, then check your access to the page and Facebook Lead Ads.

The next page allows you to check the creation of leads and whether everything is configured correctly. Create a test lead and check how the Facebook Lead Ads integrations work:


The next item is setting up integration with Mailigen. The first two steps have already been completed for us, just like for Facebook Lead Ads, so let's move on to adding a Mailigen account. You will be prompted to enter an API key, read how to get it.

After entering the key, verifying it and selecting the desired account, go to the next step, where you can configure which list, with which options (for example, with or without double confirmation) and in which fields to load the received data:


If a test lead was created from Facebook Lead Ads, the fields can be selected together with the test lead's values, so that you do not make a mistake.
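The field mapping configured in this step can be pictured as a simple dictionary lookup. The sketch below is only an illustration: the lead keys and the list field names are hypothetical, and the code does not call the real Zapier, Facebook or Mailigen APIs.

```python
# Hypothetical mapping: Facebook lead form field -> mailing list field.
FIELD_MAP = {
    "full_name": "FNAME",
    "email": "EMAIL",
    "phone_number": "PHONE",
}


def map_lead(lead: dict, field_map: dict = FIELD_MAP) -> dict:
    """Return a subscriber record containing only the mapped fields."""
    subscriber = {}
    for source, target in field_map.items():
        value = lead.get(source)
        if value:  # skip fields the lead left empty
            subscriber[target] = value.strip()
    if "EMAIL" not in subscriber:
        raise ValueError("a lead without an email cannot be subscribed")
    return subscriber


# A made-up test lead; the unmapped "city" field is simply ignored.
test_lead = {"full_name": " Test Lead ", "email": "test@example.com", "city": "Riga"}
subscriber = map_lead(test_lead)
print(subscriber)
```

Checking the mapping against a test lead first, as the article suggests, catches a wrong field before real contacts start flowing into the list.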

Once you're done setting up all the fields, move on to the next step and test adding a test lead to Mailigen.

If the test was successful, then congratulations: your first zap is ready to go. Turn it on!


That's all. You now have a working link between your Facebook ads and Mailigen. And you were afraid. Enjoy!


Although Facebook trails social networks such as Vkontakte and Odnoklassniki in popularity among the Russian-speaking audience, it is still a very popular platform for social activity on the Internet. But not everyone knows how this social network actually uses information about its users. Below we explain how this resource monitors us and what you can do to protect your personal data.

How Facebook tracks us

In 2014, representatives of this worldwide social network reported that their servers receive approximately 600 terabytes of data every day. This amount of information is comparable to 193 million copies of the book "War and Peace". Several years have passed, and there is no doubt that the daily volume of data has increased significantly since then. Imagine how much personal information this company owns!
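A quick check of what this comparison implies per book. The sketch assumes decimal terabytes (10^12 bytes); the source does not say which unit was meant.

```python
daily_bytes = 600 * 10**12   # 600 TB per day, decimal units assumed
copies = 193 * 10**6         # 193 million copies of "War and Peace"

bytes_per_copy = daily_bytes / copies
mb_per_copy = bytes_per_copy / 10**6
print(f"{mb_per_copy:.1f} MB per copy")
```

The result is about 3 MB per copy, which is a plausible size for a very long novel stored as plain text, so the comparison is at least internally consistent.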

“Well, what can a social network learn about me? I am a law-abiding citizen, and I have nothing to hide,” thinks an ordinary user and automatically ticks the box next to the resource's privacy policy. But even if everyone read this document, some information about the use of personal data would still remain, as it were, "between the lines."

What exactly has Facebook learned to find out about its users? Much more than you might think!

It sees what users were going to write

Perhaps the most interesting and even compromising information is contained in the messages that we typed in the heat of the moment but, for various reasons, did not send or rewrote. And do not think that no one has seen them but you!

The social network actually revealed this "skill" itself by publishing its own study on self-censorship ("Self-censorship on Facebook", 2013), which explained why and how users correct their posts before publishing. It turns out that the system is capable of registering keystrokes while you type, so personal data that was typed once may remain in the social network's database even if you erase it.

It transfers personal data to third parties

Facebook studies unsent data in secret to compile a portrait of the user's personality, but it can use already published information openly, as stated in the license agreement. And this includes not only personality studies and other research of its own: the system transfers your personal data to marketing companies and the American government.

Be aware: even if you did not indicate your mobile phone number or e-mail address in your profile, but some of your friends tried to find you using these data, then the system already knows this information.

Moreover, the social network also collaborates with other sites you visit to collect missing information, for example, about your income, online behavior, etc., and then adjusts the news feed for you to promote targeted advertising.

It tries to identify the user's face

It watches you even when you are offline

Here, too, the inattentively read Privacy Policy of this site makes itself felt, which clearly states that:

The system is able to collect personal data in this way using single sign-on technology and cookies. In addition, the social network is trying, or has already learned to track the movement of the cursor on the screen.

What is the main danger of using Facebook

As we mentioned above, in addition to setting up relevant posts, advertising and sales, personal data of users is transferred to the US government, on the territory of which the social network is registered. But the power of countries is not concentrated in the hands of presidents, prime ministers and other officials - in addition to them there is a secret group of powerful representatives of the richest clans in the world who control all types of industry, the banking system, territorial borders ... This group is denoted by the term "world government" ...

Its goal is to establish a new world order, which implies total control over the population of the planet and all spheres of its life. The ultimate tool for managing a person should be a nanochip, or a laser nanomark, applied to the forehead or right hand, which, according to the Revelation of the Holy Apostle John the Theologian (Rev. 13: 15-18), will signify the arrival of the Antichrist. And the intermediate stage is just the assignment of digital identifiers to the population (INN, SNILS), the introduction of UEC and biometric passports, as well as the very collection and processing of personal data.

Personal data protection on Facebook

The only viable option in this situation would be a complete rejection of social networks. But if this is not yet possible, you should at least try to keep the infringement of your personal data to a minimum, adhering to the following rules:

Unfortunately, not only the social network can track us, but also the operating system - see for yourself how our personal data is used by Windows 10:



Not a net, but a sieve

Before the scandal began, information about how third-party applications use the personal data of Facebook users was contained in the company's privacy policy in a rather hard-to-understand form. Even Zuckerberg himself said in Congress that most of the audience does not read this document or does not delve into what it says. Immediately after the investigation started, the company began to explain what third-party companies had been getting and promised to tighten their access to information. One of these Facebook explanations revealed the following:

  • Until now, any user could find the person he needed by entering his phone number or email into the search bar. This function could have been used by cybercriminals as well.
  • Facebook kept the call and message history of owners of Android smartphones with the Facebook Messenger and Facebook Lite applications installed. The company promised to review this function to make sure that the content of users' messages was not stored. Zuckerberg also refuted a popular myth that the social network eavesdrops on users' conversations in order to show targeted ads (users drew this conclusion because the application requests access to the microphone when it is installed on a smartphone): according to him, access to the microphone is needed only for correct video playback.
  • Administrators and members of closed groups could give third-party applications access to the list of group members and their personal data (names; photos attached to posts, comments to them).
  • Third-party applications could read any posts and comments to them through the API (programming interface) of the pages.
  • Until 2014, third-party applications could request information from Facebook not only about the user himself, but also about his friends. After the changes, applications can receive information only about those friends who have agreed to share it. In March 2018, Facebook also announced that it would revoke an application's permission to collect a user's information if the user has not used the app for more than three months.
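The two rule changes in the last item can be sketched as follows. This illustrates the policy as it is described here and is not Facebook's implementation; the function names, the data structures and the 90-day approximation of "three months" are all assumptions.

```python
from datetime import date, timedelta

INACTIVITY_LIMIT = timedelta(days=90)  # "three months", approximated


def visible_friends(friends: list[str], consented: set[str]) -> list[str]:
    """After 2014: an app sees only the friends who agreed to share data."""
    return [friend for friend in friends if friend in consented]


def access_still_valid(last_used: date, today: date) -> bool:
    """March 2018 rule: access lapses after three months without use."""
    return today - last_used <= INACTIVITY_LIMIT


friends = ["alice", "bob", "carol"]
shared = visible_friends(friends, consented={"bob"})
active = access_still_valid(date(2018, 3, 1), today=date(2018, 4, 1))
expired = access_still_valid(date(2018, 1, 1), today=date(2018, 6, 1))
print(shared, active, expired)
```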

Facebook currently collects two types of data. The first is information that people themselves post on the social network: photos, posts, etc. The second is data needed for targeted advertising. To improve its effectiveness, Facebook also buys the services of information brokers (data brokers). These brokers collect information from many sources, including platforms like Google, Amazon and Facebook as well as companies in data-heavy industries (media, retail, telecommunications and finance), and provide other companies with services related to targeted advertising and scoring, that is, checking bank borrowers and insurance clients. According to a report from the Cracked Labs research institute, Facebook had six such partners in 2017: Acxiom, Epsilon, Experian, Oracle, CCC Marketing, and Quantium. They helped the platform to better sort and classify its users.


Facebook does not sell or transfer user data to advertisers. As a representative of the social network explained, the company analyzes the data and then sorts users into categories according to their preferences. If an advertiser wants their ad to be seen by “female cyclists from Atlanta,” Facebook will serve the ad to that category of users without sharing their details with third parties. The reports for advertisers contain only aggregate information about how successful the advertisement was: how many people of which gender clicked on the banner, and other statistics.
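The scheme described here, in which targeting happens inside the platform and the advertiser sees only totals, can be sketched like this. The profiles, field names and report layout are invented for the example and say nothing about how Facebook's ad system is actually built.

```python
from collections import Counter

# Made-up profiles; in this sketch they never leave the platform.
users = [
    {"id": 1, "gender": "female", "city": "Atlanta", "interests": {"cycling"}},
    {"id": 2, "gender": "male", "city": "Atlanta", "interests": {"cycling"}},
    {"id": 3, "gender": "female", "city": "Atlanta", "interests": {"cycling", "yoga"}},
    {"id": 4, "gender": "female", "city": "Boston", "interests": {"cycling"}},
]


def select_audience(users, gender, city, interest):
    """Pick the users who match the advertiser's category."""
    return [
        u for u in users
        if u["gender"] == gender and u["city"] == city and interest in u["interests"]
    ]


def advertiser_report(audience, clicked_ids):
    """Return aggregate statistics only: no ids, names or contacts."""
    clicks = [u for u in audience if u["id"] in clicked_ids]
    return {
        "impressions": len(audience),
        "clicks": len(clicks),
        "clicks_by_gender": dict(Counter(u["gender"] for u in clicks)),
    }


audience = select_audience(users, "female", "Atlanta", "cycling")
report = advertiser_report(audience, clicked_ids={3})
print(report)
```

The advertiser only ever receives `report`; the `audience` list with individual profiles stays on the platform side.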

What third-party sites and applications do with public information of Facebook users is not known for certain. It is only clear that this information is collected by many companies.

Test it

Alexander Kogan was not the only one to use tests to collect information on Facebook; many developers do it. RBC analyzed the privacy policies of some of them.

  • Nametests.com

The site Nametests.com, owned by Socialsweethearts, offers tests such as "What has April prepared for you?" and "What does your ideal partner look like?". To take a test, the user agrees to share information about his public profile, friends list, email address and likes. The company's privacy policy states that it stores the requested information in anonymous form and uses it to compile statistics and improve the site. The use of data without anonymization is allowed only in cases provided for by law, as well as for purposes necessary to ensure the functioning, security and optimization of the service, according to the documents of Socialsweethearts.

After deleting the account, user data will also be deleted, according to a Socialsweethearts spokesperson. “We do not analyze and do not conduct research on data for political and other similar purposes, and we also do not cooperate with companies or organizations engaged in such research,” he said.

According to him, now Socialsweethearts are preparing to comply with the General Data Protection Regulation (GDPR), which will enter into force in the European Union on May 25, 2018. “We understand, given the news surrounding Facebook, that the confidence of users [in the safety of their personal data] is very important and at the same time, the processes associated with their personal data should be transparent,” says a spokesman for Socialsweethearts.

  • Playbuzz

Playbuzz, which also owns a website with tests, is preparing for the introduction of the GDPR as well, a representative of the company said. The current version of Playbuzz's privacy policy states that the platform may collect personal information that is entered upon registration, information about the device from which the user visits the site, and the answers from completed tests. In addition, Playbuzz collects personal information about users from third parties for marketing purposes, and may also transfer anonymized, aggregated information about people to its partners for advertising purposes.

A representative from Playbuzz noted that due to the fact that the platform's content is monetized, some of the company's partners, as well as third-party providers (such as fraud detection services), may collect data from some end users (for example, IP addresses). “This data is not available to Playbuzz and is not stored on our servers,” he added.

Even if the user deletes his page on the site, Playbuzz reserves the right to transfer his personal data to third parties, its business partners, for non-marketing purposes (for example, to contact the user).

  • Brainfall Media

The service agreement of Brainfall Media (engaged in online research and also collects personal data on Facebook) says that the company considers information about users as a business asset and has the right to transfer it to third parties with the consent of users. The company did not respond to RBC's request.

Spies on smartphones

Web sites with visit trackers and mobile apps are black holes: no one can truly assess who they are sharing data with, according to a Cracked Labs study. In 2015, a study of popular apps in Australia, Brazil, Germany and the United States by the NICTA Research Center and the University of New South Wales found that 85-95% of free and up to 60% of paid apps collected user information for the benefit of third parties. RBC journalists analyzed applications that collected information from their Facebook accounts. Among them were programs of several well-known developers.

“Access to general profile information and email address is provided automatically to all accredited applications. Permission to request this data is part of Facebook's minimum basic set for application developers, and the social network has no narrower request,” Stepan Danilov, founder and CEO of the MeYou networking service, explained to RBC. Basic permissions do not require developer verification, but all others that claim more information do, according to Facebook's permissions help for developers.

Apps from the developer Rambler Group, such as LiveJournal and Afisha-eda, also requested information about the user's city of residence and hometown, and access to posts on the timeline. A representative of the Rambler Group press service explained that users of their media resources can log in in several ways, including through Facebook. This method of authorization makes it possible to use the applications to the full, for example, to take part in polls, leave comments, etc. “For our part, we get the potential opportunity to work with big data and, in the future, set up “smart targeting”, increasing the efficiency of interaction with advertising media for both users and advertisers. Ideally, people are ready to interact only with advertisements that may be of interest to them. On the other hand, the advertiser gets in touch with a potentially highly motivated user,” he added.

The Amediateka TV show application, among other things, gains access to the client's friends list. “The list of friends is not used at the moment, but it is intended to update the recommendation system based on the interests of the user's friends,” said Milana Bogatyreva, a representative of Amedia TV.

Some apps requested access to Facebook users' status updates, photos and videos. For example, TripAdvisor. The Nokia app had access, among other things, to data about marital status, jobs, preferences, education, religious and political beliefs, and other information. Representatives of TripAdvisor and HMD Global (owns the rights to the Nokia brand) did not answer RBC's questions.

Custom soul collectors

Facebook is not the main source of user data. The Cracked Labs study identifies information brokers as the main sources, and its experts named Acxiom and Oracle as the largest such companies. For example, Acxiom has been collecting consumer data for decades from public sources: telephone directories, court records, crime reports, various registries, questionnaires, surveys, etc. Later, digital sources were added, for example, large IT companies whose software makes it possible to analyze telephone conversations, financial transactions, Internet activity, etc. to identify criminal and terrorist activity.

In addition, Acxiom partners with Ibotta (collects purchase data using information from loyalty cards or receipts), Samba TV (collects TV viewing data through programs installed on set-top boxes or video-on-demand platforms), Crossix (collects medical information, including medical history, doctor's appointments, prescriptions, etc.), FreckleIOT (collects real-time data on a person's location: special sensors can be installed in stores, airports, bars, etc., and the user's smartphone communicates with them and sends information) and other companies that mainly operate in the United States. Acxiom stores this information under a unique anonymous ID, a kind of code that is associated with a postal address, phone number, email, IP address, geolocation, cookie and device ID. Acxiom assigns each unique ID several categories that the person falls into. A client can give Acxiom a consumer's email and ask which categories the information broker has placed that consumer in.

There is no unified system for assessing the size of the user data market. According to a study by 451 Research, the global data market of telecommunications companies alone amounted to $24 billion in 2015 and should grow to $79 billion by 2020. Mobile operators in at least ten countries (Russia was not among them) were found to have installed a special mechanism to track subscribers' behavior when they browse the Internet. Moreover, subscribers could not block such "super-cookies".

While user data is currently used to sell targeted ads and for scoring, it may find other, less harmless uses in the future. For example, the data can be used to dynamically change the prices of goods on the website of an online store, depending on who visits it. This can be a price reduction, if the system considers the user a valuable long-term customer for the company, or an increase, depending on how much a particular user is willing to pay for the item at the moment. With the help of personalization, companies can try to influence consumer behavior, for example by showing a person an ad at a particular moment to prompt a purchase.

Facebook is a real phenomenon. The largest social network in the world is estimated at one hundred billion dollars. It has over a billion users. But storing data, photos and messages for more than a seventh of the world's population requires advanced technology. So how is this done?

Northern California. Valley of computer giants. Here is the name that attracts the most tourists - Facebook.

This social network, invented by Harvard students in 2004, lets your friends know what you are doing with a click of the mouse. For many, there is nothing cooler than online communication. Eight years after the company was born, it went public at an incredible valuation of $104 billion. It is noticeable that Facebook was created by students: they do everything in their own way. Graffiti and a touch screen cover an entire wall. Vending machines dispense boxes of gadgets instead of cans of drinks. There are open bars and video games for a staff whose average age is 26. This strange environment seems to work. People from all over the world visit Facebook, and every six months their number increases by 100 million. Processing the personal data of so many people is no easy task.


An employee of the company says: “We have one engineer for every million users. We are working on an unprecedented scale.”

They cannot draw on anyone else's experience, because no site has ever had so many visitors. And when you have more users than there are cars in the world, one of the biggest concerns is storage. Your laptop's hard drive fits in your hand. Something bigger is needed here.

In Prineville, Oregon, there is a huge data center covering 28,000 square meters.

It's like a memory stick the size of three football fields, worth hundreds of millions of dollars. This is where your information is stored: on the latest servers, in vast memory banks, between which data travels at the speed of light over fiber-optic cables almost 6.5 thousand kilometers long. Ken Patchett, general manager of the data center, says: “When you enter the address facebook.com, your request goes out to the Internet and then here, where it reaches one of the Facebook servers. Your profile and all data associated with it are processed and assembled by our data centers and sent back to you via the Internet. All this happens in milliseconds. Some people think of the Internet as something like a cloud floating in the sky. But no, it is a material thing. The Internet is computers, servers and data centers connected by kilometers of cable all over the world. All these units can communicate with each other and share data.”

If you want to visualize the internet, these endless rows of servers are a great illustration. Compared to this place, the supercomputer looks like a pocket calculator. It is supplied with 30 megawatts of electricity, so electricity is always available.

But just as losing data without a backup is a disaster for your computer, a power outage can be a disaster here. For millions of teenagers, a world without social networks is simply unimaginable. So huge diesel generators stand at the ready. If the main power line to the building is cut, these generators will come online. Employees monitor them constantly. They generate 3 megawatts each, and there are 14 of them.
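The backup figures can be checked with simple arithmetic. The sketch assumes that the 30 megawatts mentioned above is the full load the generators would have to carry, which the source does not state explicitly.

```python
import math

load_mw = 30        # power supplied to the data center
generators = 14
generator_mw = 3    # output of one diesel generator

backup_mw = generators * generator_mw          # total backup capacity
needed = math.ceil(load_mw / generator_mw)     # generators needed for the load
spare = generators - needed                    # units that may fail or be serviced

print(backup_mw, needed, spare)
```

Under that assumption the generators provide 42 MW in total, so ten of them cover the load and four remain in reserve.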

Another problem: all this equipment generates a huge amount of heat. Without cooling, these servers will fail. A home computer's processor is cooled by a fan a bit larger than a matchbox.

Here, the job is done by an extensive seven-room penthouse: a modern system of natural air conditioning. Cold air from the highlands of Oregon is drawn in, filtered and mixed with warm air to regulate the temperature in the data center.

A fine water mist sprayed by nozzles controls the humidity.

Cooled air is supplied from the back of the servers to prevent overheating. And finally, the excess warm air is drawn out by huge fans, a hundred times larger than those of a home computer.

More fans will be needed soon because the social network is booming. Almost 600 million people visit the site every day. This is almost double the population of the United States. And the site continues to grow. A thousand new servers are brought here every day. Tom Furlong is in charge of data centers. “When I started working 4.5 years ago,” he says, “we had 27 million users and several thousand servers. Today we got a thousand servers, and I barely noticed it.”

Huge trucks come here. They don't deliver food: they bring in more and more server storage. Most of us are familiar with gigabytes and terabytes. Here the count is in petabytes. More than 100 petabytes of photos and videos are stored on Facebook servers, and there are more of them every day. This is an incredible amount of information.

Every day, a data center receives 100,000 times more data than the hard drive of an advanced personal computer can hold. Each server rack has 500 terabytes, more than 130 billion times more than Apple's first PC. And if a server goes down, technicians like David Gaylard are tasked with finding the needle in the digital haystack.
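The rack comparison can be reproduced under two assumptions that the source does not spell out: that "Apple's first PC" means the Apple I with its standard 4 KB of memory, and that the units are binary (1 TB = 2^40 bytes, 1 KB = 2^10 bytes).

```python
rack_bytes = 500 * 2**40     # 500 TB per rack, binary units assumed
apple_1_bytes = 4 * 2**10    # 4 KB, assumed baseline for the comparison

ratio = rack_bytes // apple_1_bytes
print(f"{ratio:,}")  # prints 134,217,728,000
```

With these assumptions the ratio comes out at roughly 134 billion, which matches the "more than 130 billion times" in the text.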

A hard drive has failed, and he goes in search of the right rack in the maze of buzzing servers. After finding the rack, David replaces the entire board in the time it takes you to update your status. But David and the other technicians are not omnipotent. Worldwide, nearly 2.5 billion people have access to the Internet, and on average they spend 20% of their time online on social media, uploading hundreds of millions of photos, posts, and updates every day. With such activity, space is running out even in such a huge data center. The builders are already working on increasing the capacity. But with this scale of network activity, they'd better hurry.

The film "The Social Network" is a good illustration of the phenomenon of
Facebook's development: the project managed to gather a fabulous, previously
unthinkable audience in record time. However, one more component of the project
remained behind the scenes: how it works from the inside, its technical design.

What is Facebook now? This is best demonstrated by dry numbers:

  • 500,000,000 active users (monthly audience);
  • 200,000,000,000 page views per month;
  • 150,000,000 cache hits per second;
  • 2,000,000,000,000 objects in the cache;
  • 20,000,000,000 photos in 4 resolutions. They would be enough to
    cover the surface of the earth in 10 layers, and this is more than on all
    other photo resources combined;
  • more than 1,000,000,000 chat messages every day;
  • more than 100 million search queries daily;
  • more than 400,000 developers of third-party applications;
  • about 500 developers and system administrators on staff;
  • more than 1,000,000 active users per engineer;
  • tens of thousands of servers, tens of gigabits of traffic.
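A few of these figures can be cross-checked against each other. The sketch assumes a 30-day month and an evenly spread load, which is a simplification: real traffic has peaks.

```python
monthly_users = 500_000_000
engineers = 500
page_views_per_month = 200_000_000_000
cache_hits_per_second = 150_000_000

SECONDS_PER_MONTH = 30 * 24 * 60 * 60   # 30-day month assumed

users_per_engineer = monthly_users // engineers
page_views_per_second = page_views_per_month / SECONDS_PER_MONTH
cache_hits_per_page_view = cache_hits_per_second / page_views_per_second

print(f"{users_per_engineer:,} users per engineer")
print(f"{page_views_per_second:,.0f} page views per second")
print(f"{cache_hits_per_page_view:,.0f} cache hits per page view")
```

The first result agrees with the "1,000,000 active users per engineer" item in the list. The last one suggests that, averaged this way, a single page view touches the cache on the order of two thousand times.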

How does it all work?

Scalability, simplicity, openness

You can feel differently about social networks in general and Facebook in
particular, but from a technological point of view it is one of the most
interesting projects. It is especially nice that the developers have never
refused to share their experience of building a resource that withstands such
loads. There is great practical benefit in this. After all, the system is based
on publicly available components that you can use, I can use, and anyone can
use. Moreover, many of the technologies that were developed inside Facebook are
now open source, and again, anyone can use them. The developers of the social
network used open source technologies wherever possible and followed the Unix
philosophy: every component of the system must be as simple and efficient as
possible, and problems are solved by combining components. All efforts of the
engineers are aimed at scalability, minimizing the number of points of failure
and, most importantly, simplicity. To back this up, here are the main
technologies that are now used inside Facebook:

I think it will be most interesting to hear how the project managed to
use the most familiar technologies. And there really are a lot of
nuances.

What usually happens in 20 minutes on Facebook?

  • People post 1,000,000 links;
  • Tag friends in 1,323,000 photos;
  • Invite 1,484,000 acquaintances to events;
  • Post 1,587,000 messages on walls;
  • Write 1,851,000 new statuses;
  • 2,000,000 pairs of people become friends;
  • 2,700,000 photos are uploaded;
  • 10,200,000 comments appear;
  • 4,632,000 private messages are sent.
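
To get a feel for the load behind these figures, it helps to convert them into
average rates per second. The sketch below is plain arithmetic on the numbers
from the list above; the dictionary keys are just labels chosen for this
example, and the averages say nothing about peak load.

```python
# Counts per 20 minutes, copied from the list above.
EVENTS_PER_20_MIN = {
    "links posted": 1_000_000,
    "photo tags": 1_323_000,
    "event invitations": 1_484_000,
    "wall messages": 1_587_000,
    "status updates": 1_851_000,
    "new friendships": 2_000_000,
    "photos uploaded": 2_700_000,
    "comments": 10_200_000,
    "private messages": 4_632_000,
}

WINDOW_SECONDS = 20 * 60  # 20 minutes = 1,200 seconds


def per_second(count, window=WINDOW_SECONDS):
    """Average number of events per second over the window."""
    return count / window


if __name__ == "__main__":
    for name, count in EVENTS_PER_20_MIN.items():
        print(f"{name:>18}: {per_second(count):>8,.0f} per second")
    total = sum(EVENTS_PER_20_MIN.values())
    print(f"{'total':>18}: {per_second(total):>8,.0f} per second")
```

Comments alone average 8,500 per second, and every item on the list is a write
action, which is the expensive kind of request for a database.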

PHP project

This raises the question: why PHP? In large part it simply "happened
historically." PHP is well suited to web development, easy to learn and work
with, and programmers have a huge assortment of libraries available. It also has
a huge international community. On the negative side, it consumes a lot of RAM
and computing resources. Once the amount of code became too large, weak typing,
a linear growth in cost as additional files are included, and limited
opportunities for static analysis and optimization were added to the list. All
of this began to create great difficulties. For this reason, Facebook made many
improvements to PHP, including bytecode optimization, improvements to APC (lazy
loading, lock optimization, cache warm-up) and a number of native extensions
(a memcache client, a serialization format, logging, statistics, monitoring, and
an asynchronous event-handling mechanism).

Figure: how the news feed is assembled

The HipHop project deserves special attention: it is a source code transformer
from PHP to optimized C++. The principle is simple: developers write in PHP,
which is converted to optimized C++. The transformer performs static code
analysis, data type inference, code generation and much more. HipHop also makes
it easier to develop extensions and significantly reduces RAM and CPU
consumption. It took a team of three programmers a year and a half to develop
this technology; in particular, most of the interpreter and many PHP language
extensions were rewritten. The HipHop source code is now published under an
open source license, so feel free to use it.

Facebook development culture

  • Move fast and don't be afraid to break some things;
  • small teams with big impact;
  • be outspoken and innovative;
  • give innovations back to the open source community.

Improvements to MySQL

Now about the database. Unlike the vast majority of sites, Facebook uses MySQL
as a simple key-value store. A large number of logical databases are distributed
across physical servers, but replication is used only between data centers.
Load balancing is done by redistributing the logical databases across machines.
Since the data is distributed almost randomly, the code uses no JOIN operations
that combine data from several tables. This makes sense: it is much easier to
add computing power on the web servers than on the database servers.
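
A minimal sketch of this layout might look as follows. It is not Facebook's
code: the number of logical databases, the host names, the CRC32 hash and the
key format ("user:1", "friends:1") are all assumptions made for illustration,
and plain dictionaries stand in for MySQL. The point is that rebalancing only
changes a mapping, and that the "join" happens in application code on the web
server.

```python
import zlib

NUM_LOGICAL_DBS = 4096                       # assumed count, not from the article
SERVERS = ["mysql-a", "mysql-b", "mysql-c"]  # hypothetical host names

# Logical database -> physical server. Rebalancing edits only this mapping.
db_to_server = {db: SERVERS[db % len(SERVERS)] for db in range(NUM_LOGICAL_DBS)}

# In-memory stand-in for the logical databases: each one is a key-value map.
logical_dbs = {db: {} for db in range(NUM_LOGICAL_DBS)}


def logical_db(key):
    """Pick a logical database with a stable hash of the key (an assumption)."""
    return zlib.crc32(key.encode("utf-8")) % NUM_LOGICAL_DBS


def server_for(key):
    """Physical server that currently hosts the key's logical database."""
    return db_to_server[logical_db(key)]


def move_logical_db(db, new_server):
    """Balance load by reassigning one logical database to another machine."""
    db_to_server[db] = new_server


def put(key, value):
    logical_dbs[logical_db(key)][key] = value


def get(key):
    return logical_dbs[logical_db(key)].get(key)


def friend_names(user_id):
    """Application-side 'join': read the id list, then look up each profile."""
    friend_ids = get(f"friends:{user_id}") or []
    profiles = (get(f"user:{fid}") for fid in friend_ids)
    return [p["name"] for p in profiles if p is not None]


if __name__ == "__main__":
    put("user:2", {"name": "Bob"})
    put("user:3", {"name": "Carol"})
    put("friends:1", [2, 3])
    print(friend_names(1), "served via", server_for("friends:1"))
```

Each lookup may land on a different server, so the web tier does the combining
work that a JOIN would otherwise push onto the database.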

Facebook uses almost unmodified MySQL source code, but with its own
partitioning schemes based on globally unique identifiers and archiving based on
how often data is accessed. This approach is very effective because most requests
are for the freshest information. Access to new data is optimized as much as
possible, and old records are archived automatically. In addition, Facebook uses
its own graph-based data access libraries, in which objects (the vertices of the
graph) can have only a limited set of data types (integer, limited-length
string, text), and links (the edges of the graph) are replicated automatically,
forming an analogue of distributed foreign keys.
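
The graph model can be illustrated with a small in-memory sketch. Everything
here is hypothetical: the class and method names, the 255-character limit, and
in particular the reading of "replicated automatically" as "each edge is also
stored under its destination with an inverse type". The article does not
describe the real libraries in that much detail.

```python
MAX_STRING_LENGTH = 255  # assumed limit; the article gives no number


class Text(str):
    """Marker type for unbounded text, as opposed to a limited-length string."""


class GraphStore:
    def __init__(self):
        self.objects = {}  # object id -> {field name: value}
        self.edges = {}    # (source id, edge type) -> list of destination ids
        self._next_id = 1  # single counter standing in for globally unique ids

    @staticmethod
    def _check_type(name, value):
        """Allow only integers, limited-length strings and Text values."""
        if isinstance(value, Text):
            return
        if isinstance(value, str):
            if len(value) > MAX_STRING_LENGTH:
                raise ValueError(f"{name}: string longer than {MAX_STRING_LENGTH}")
            return
        if isinstance(value, int) and not isinstance(value, bool):
            return
        raise TypeError(f"{name}: unsupported type {type(value).__name__}")

    def add_object(self, **fields):
        for name, value in fields.items():
            self._check_type(name, value)
        object_id = self._next_id
        self._next_id += 1
        self.objects[object_id] = dict(fields)
        return object_id

    def add_edge(self, source, edge_type, destination, inverse_type):
        """Write the edge under both endpoints so either side can read it."""
        self.edges.setdefault((source, edge_type), []).append(destination)
        self.edges.setdefault((destination, inverse_type), []).append(source)

    def neighbors(self, source, edge_type):
        return list(self.edges.get((source, edge_type), []))


if __name__ == "__main__":
    store = GraphStore()
    alice = store.add_object(name="Alice", age=30)
    photo = store.add_object(caption=Text("A long caption..."))
    store.add_edge(alice, "uploaded", photo, "uploaded_by")
    print(store.neighbors(photo, "uploaded_by"))
```

Keeping a copy of the edge at each endpoint is what lets it act like a foreign
key even when the two objects live in different logical databases.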

Using Memcached

As you know, memcached is a high-performance distributed hash table.
Facebook stores "hot" data from MySQL in it, which significantly reduces the
load on the database tier. More than 25 TB of RAM (just think about that
figure) is in use on several thousand servers, with an average response time
below 250 μs. Serialized PHP data structures are cached, and because there is no
automatic mechanism for checking data consistency between memcached and MySQL,
this has to be done at the code level. The main way memcache is used is through
many multi-get requests that fetch the data at the other end of graph edges.
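
The pattern described here, a multi-get against the cache with a fallback to
the database and invalidation done by hand, can be sketched like this. The two
classes are in-memory stand-ins rather than real memcached or MySQL clients,
their method names are hypothetical, and pickle is used in place of serialized
PHP structures.

```python
import pickle


class FakeMemcache:
    """In-memory stand-in for a memcached client (hypothetical interface)."""

    def __init__(self):
        self._data = {}

    def get_multi(self, keys):
        return {k: self._data[k] for k in keys if k in self._data}

    def set_multi(self, mapping):
        self._data.update(mapping)

    def delete(self, key):
        self._data.pop(key, None)


class FakeDatabase:
    """In-memory stand-in for MySQL used as a key-value store."""

    def __init__(self, rows):
        self.rows = dict(rows)
        self.queries = 0  # counts round trips, to show what the cache saves

    def fetch_many(self, keys):
        self.queries += 1
        return {k: self.rows[k] for k in keys if k in self.rows}

    def update(self, key, value):
        self.rows[key] = value


def load_many(cache, db, keys):
    """Multi-get from the cache, then fetch only the misses from the database."""
    keys = list(keys)
    result = {k: pickle.loads(blob) for k, blob in cache.get_multi(keys).items()}
    missing = [k for k in keys if k not in result]
    if missing:
        rows = db.fetch_many(missing)
        cache.set_multi({k: pickle.dumps(v) for k, v in rows.items()})
        result.update(rows)
    return result


def save(cache, db, key, value):
    """Consistency is the caller's job: write the row, then drop the cached copy."""
    db.update(key, value)
    cache.delete(key)


if __name__ == "__main__":
    db = FakeDatabase({"user:2": {"name": "Bob"}, "user:3": {"name": "Carol"}})
    cache = FakeMemcache()
    load_many(cache, db, ["user:2", "user:3"])  # misses: one database query
    load_many(cache, db, ["user:2", "user:3"])  # hits: no database query
    print("database queries:", db.queries)
```

Deleting the key on write, rather than updating it, is one simple way to keep
the cache from serving stale data; the article does not say which approach
Facebook's code takes.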

Facebook is very actively involved in improving the project's performance.
Most of the following improvements have been included in the open source
version of memcached: a port to 64-bit architecture, serialization,
multithreading, compression, and memcache access over UDP (which reduces memory
consumption because thousands of TCP connection buffers are no longer needed).
In addition, some changes were made to the Linux kernel to optimize memcache.
How effective is this? After these modifications, memcached can perform up to
250,000 operations per second, compared to the 30,000 to 40,000 of the original
version, which is roughly six to eight times more.

Thrift framework

Another Facebook innovation is the Thrift project. In essence, it is a
mechanism for building applications that use several programming languages. Its
main goal is to provide transparent interaction between different programming
technologies. Thrift offers developers a special interface description language
and a static code generator, and supports many languages, including C++, PHP,
Python, Java, Ruby, Erlang, Perl and Haskell. You can choose the transport
(sockets, files, in-memory buffers) and the serialization format (binary, JSON).
Several types of servers are supported: non-blocking, asynchronous, both
single-threaded and multi-threaded. Alternative technologies include SOAP, CORBA,
COM, Pillar and Protocol Buffers, but each has significant drawbacks, and this
forced Facebook to develop its own. Thrift's big advantage is performance.
It is very, very fast, but even this is not its main advantage. With the advent of Thrift

Information about Facebook's work with the open source community on these and
other projects is located on
