How the smart feed, the "Recommendations" section and the "Prometheus" algorithm work: first-hand information from the VK team

We asked representatives of the VKontakte team topical questions about Prometheus, the smart feed and their plans for the future. Sergey Paranko, director of the media ecosystem, and Andrei Zakonov, director of growth and social network research, answer. Read on for first-hand information straight from the source.

How the smart feed and the VKontakte "Recommendations" section work

1. How do your algorithms understand that a post is interesting to a specific user? Can you list the factors that most affect a post's reach? Content quality and audience response are obvious. Is there anything less obvious?

We analyze the user's activity in the feed and raise the posts they are more likely to interact with: like, comment on, get "stuck" on, open a photo, read an article, watch a video. These actions are very different, so we predict them separately. A general algorithm then takes all of these predictions into account, together with all the other factors, and performs the ranking.
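To make the idea of "predict each action separately, then rank" concrete, here is a minimal sketch. It is an illustration only: the interview does not say how the predictions are combined, so the weighted sum, the `ACTION_WEIGHTS` values and the `Candidate`, `score` and `rank_feed` names are all assumptions.

```python
from dataclasses import dataclass

# Hypothetical weights: the interview does not reveal how actions are weighted.
ACTION_WEIGHTS = {
    "like": 1.0,
    "comment": 2.0,
    "share": 3.0,
    "open_photo": 0.5,
    "read_article": 1.5,
    "watch_video": 1.5,
}


@dataclass
class Candidate:
    post_id: str
    # Probability of each action for ONE specific user, as predicted
    # by a separate (hypothetical) model per action.
    action_probs: dict


def score(candidate):
    """Combine the per-action predictions into a single ranking score."""
    return sum(ACTION_WEIGHTS[action] * prob
               for action, prob in candidate.action_probs.items())


def rank_feed(candidates):
    """Return candidates ordered from most to least likely to engage."""
    return sorted(candidates, key=score, reverse=True)
```

Because the probabilities are computed per user-post pair, the same post can rank high in one user's feed and low in another's.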

2. In your talk at "MEN2" you said that even small things like the weather and the user's mood influence a post's reach. How does the feed understand this?

Behind every post is a huge number of ranking features. Each one is a specific value that describes the post itself, its author, the time of day when the recommendations are viewed, the speed of the reader's Internet connection and many other criteria.

If a user opens the news feed or the recommendations section from deep in a forest with a poor Internet connection, we will not show them a heavy video at the top of the feed. The algorithm also takes into account the day of the week and the time of day. So in the morning the recommendations feed will contain more news and informational content, and in the evening more media and entertainment.
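The two examples above (connection speed and time of day) can be sketched as a context adjustment applied to a post's base score. This is a hand-written rule for illustration; in the system described, these are features fed to a learned ranking model, not fixed rules. The function name, the 500 kbps threshold, the hour ranges and the multipliers are all assumptions.

```python
from datetime import datetime


def context_multiplier(post_kind, has_heavy_video, connection_kbps, now):
    """Return a hypothetical multiplier for a post's base ranking score."""
    multiplier = 1.0

    # Poor connection: push bulky video down (threshold is an assumption).
    if has_heavy_video and connection_kbps < 500:
        multiplier *= 0.2

    # Morning favors news, evening favors entertainment (hours are assumptions).
    hour = now.hour
    if 6 <= hour < 12 and post_kind == "news":
        multiplier *= 1.5
    elif 18 <= hour < 24 and post_kind == "entertainment":
        multiplier *= 1.5

    return multiplier
```

For example, `context_multiplier("news", False, 5000, datetime(2018, 1, 1, 8))` boosts a news post viewed at 8 a.m. on a fast connection, while an entertainment video viewed in the evening on a slow connection is boosted for the time of day but demoted much more strongly for its size.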

3. Do you have a document that lists all the factors the algorithm considers? Or has it already learned so much that even you do not know which parameters determine the reach of a particular post?

We know the list of factors, but everything is unique for each user-post pair. So you cannot tell how good a post is just by looking at it; it is subjective. A post about local Moscow news is absolutely irrelevant to me in St. Petersburg, and you may not be interested in my news feed about machine learning and social networks.

4. Is it possible to artificially increase a post's reach by purchasing likes and shares on cheat services?

It is not worth it. In the long run, this will definitely lead to a decrease in reach. Likes and shares do not play a central role in ranking a post, and on top of that we will detect such cheating with high probability.

However, traffic to a post from links on other social networks really can improve its organic reach, and there is no penalty for this.

5. Is there an optimal text length for maximum reach in the smart feed? For example, no more than 500 characters in an article announcement, and no more than 10,000 in a longread created in the editor?

There is no optimal length independent of the reader's attention span. Everything is based on engagement. If you have an informational message, try to write succinctly, in info style. If you are trying to get a person to click on a snippet, the same principle applies: be concise and engaging. Just do not slide into clickbait; it will negatively affect reach.

If you are telling a story, carry the reader along with you. All long forms, roughly anything over 1,000 characters, are better laid out in the article editor, with multimedia inserts breaking up the text.

6. The smart feed gives preference to high-quality video over low-quality video. What is a "quality video" for VKontakte? What characteristics does it have?

Video at 720p to 1080p.

7. Does VK try to detect fake news? Are the truthfulness and accuracy of the data important, or only the audience's interest in the post?

VKontakte has no content pre-moderation. Just as on other platforms, users can freely publish and discuss topics that interest them. At the same time, anyone can report content they consider inappropriate, abusive or inaccurate using the “Report” button. We consider all complaints without exception. VKontakte has one of the largest moderation services, and we respond as quickly as possible.

8. How do you feel about contests for the most active subscribers and "subscribe + repost" contests? VK rules allow them, but do they affect the reach of posts in any way, positive or negative? Do you recommend using contests for promotion on VK?

This is a legitimate way to promote, but keep in mind that this way you attract an irrelevant audience to your subscriber base. That can have a negative side effect: subscribers who do not engage with your group's main content can hurt the reach and virality of your posts.

9. Articles created in the editor get more reach than posts linking to articles on a third-party site. What should commercial sites and blogs do in these conditions? How can they attract VKontakte users to their site?

A business's group on any social network is first and foremost a community. If you treat it as a media asset, you will succeed. Pushing a user off to another site without offering an alternative way to communicate can upset them. As they say: you want to sell me your product, but you do it without respect.

On VK you can communicate with customers and readers, sell products, receive money transfers and much more. We are developing our ecosystem and making it as user-friendly as possible. In this respect, it is also beneficial for a business to develop its presence within VK. The article editor is another tool for communicating and building a community.

10. Will wiki articles now get less reach than articles made in the editor?

What decides is not the format but how users interact with your content. An interestingly told story with lots of reader interactions will get reach regardless of the format. That said, it is easier to hold the audience's attention with the tools available in the article editor.

As for wiki pages specifically, they do not appear in the "Recommendations" section on mobile devices. They are essentially a desktop tool, so yes, their reach is lower than that of other feed formats.

11. Some users complain that they do not see posts that are interesting and important to them. Why is this happening? And have you thought about following Facebook's path, so that the user's feed contains more personal content and less commercial content?

We are constantly experimenting with different ranking models, and the main metric is user engagement. You can always send explicit signals to the neural network by hiding a couple of posts from public pages you do not want to see.

12. You say that buying bots is not an option at all. How, then, does a community get its first audience? If you sign up relatives and friends, the audience is still untargeted. And if you buy advertising, no one will subscribe to an empty community. What is the way out?

People come to you for content. So your community will not be empty if you tell your target audience interesting things there. Advertising aimed at the target audience is an ideal start. That way you get a core audience that will help you grow organically.

13. How can you tell that a penalty has been imposed on a community? How do you remove it?

You can see it in the statistics as reduced reach. Analyze what you have been doing in recent days that the platform could regard as undesirable: clickbait, aggressively driving the audience off to a site, non-original content, posting too often. Stop doing it, and it will pass in two days :-)

14. Are there filters based on subject matter? (For example, "adult content" gets demoted.) If so, which topics have demoting factors?

Anything that is not prohibited by the legislation of the Russian Federation and the rules of the site is allowed. Anything that is prohibited is not.

15. A case from our practice. Every day we publish unique articles on our blog, and several VKontakte public pages immediately scrape them. As a result, they announce this content first, and we do so 5-10 minutes later. Does this negatively affect the reach of our posts?

Yes. Ideally, you should be the first to upload the content to the social network. Then we will identify you as the source, and your content will be promoted to new users in the "Recommendations" section.

16. So the original source is whoever shares the link first? If so, we cannot compete with auto-scrapers, because for them this happens automatically, while we write a unique announcement by hand. What should we do in this case?

Not whoever shared the link, but whoever uploaded the content natively. Embeds and links do not, in principle, make you the primary source of content. We do not know what is on someone else's video hosting or website; that content is not parsed or analyzed. That is why it is important to upload your materials using the platform's native tools.
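The rule described in the last two answers, "the first native upload is treated as the source, and links do not count", can be sketched as follows. This is only an illustration: the interview does not say how VK compares content, so the exact-match `fingerprint` function (a SHA-256 hash of normalized text) and the `find_sources` helper are assumptions, and a real system would need fuzzy matching.

```python
import hashlib


def fingerprint(text):
    """Hypothetical content fingerprint: hash of lowercased, whitespace-normalized text."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def find_sources(uploads):
    """Map each content fingerprint to the community that first uploaded it natively.

    uploads: iterable of (community, timestamp, text, is_native) tuples.
    Link-only posts (is_native=False) are skipped, because off-platform
    content is not analyzed.
    """
    sources = {}
    for community, _timestamp, text, is_native in sorted(uploads, key=lambda u: u[1]):
        if not is_native:
            continue
        # setdefault keeps the earliest native uploader for each fingerprint.
        sources.setdefault(fingerprint(text), community)
    return sources
```

Under this sketch, a blog that first posts only a link and later uploads the text natively loses the "source" status to a scraper that uploaded the same text natively in between, which is the situation described in question 15.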

17. How will articles created in the editor be indexed? Wiki articles showed up well in search; what will happen now?

Articles are well indexed by “Yandex” from the very start, and by Google too.

18. Is the LIVE public page run by VKontakte employees? Can I cite it as an official source of information?

At your own risk. This is an unofficial community.

How the "Prometheus" algorithm works

19. Are real people involved in the work of Prometheus, or is the decision left entirely to the algorithm? How does it learn? Can you explain in simple terms?

The algorithm is automated and searches for new authors on its own. The neural network sometimes makes mistakes. We mark every such error by sending the network a signal about the wrong choice. Prometheus learns from this feedback.

20. We have long been posting high-quality content that gets a response from our audience, but we have not received the fire of Prometheus. How can we influence this? Can you give specific recommendations? I want specifics, not vague tips like "post the content that your audience likes."

First of all, the Prometheus algorithm responds to original materials published with native VKontakte tools. Native content formats are extremely important because they allow the algorithm to evaluate both the uniqueness of the content and its subject matter. The first matters for getting the mark; the second, for giving an author with the “fire” a good boost in reach. Posts by such authors appear in the "Recommendations" section.

Right now, virtually all of your community's posts are one- or two-line phrases plus a link or snippet leading to the site. For Prometheus to notice a public page, try to increase the amount of natively published material. Publish some texts in the article editor, upload videos to the VK player, or at least make the lead-ins longer :-)

At the moment the algorithm has nothing to latch onto. It sees short phrases plus links and cannot “evaluate” the uniqueness of the materials.

21. Andrey Zakonov, director of growth and research at VKontakte, said in one of his interviews that your algorithms treat content as unique even when it is copied from another source but supplemented with some conclusions, or simply given a good lead-in. Do you think this is fair to the original author? After all, they did the main work, and someone else gets the reach.

This does not mean that the source gets no reach. Consider an example: an artist posts a photo of their new work on their page. A well-known critic writes a review of the picture and attaches the image to the post. In this case, both the first and the second are authors and creators of unique content. Both will get reach.

User-generated content is one of the strengths of social networks. If someone has created interesting content, it begins to spread across the network. Users discuss, criticize, complement, comment and share with each other. This is normal. Clearly a plain repost will get no boost, but good added value may. Such is postmodernism on the march.

22. Are there any topics that will never receive the fire of Prometheus, no matter how hard their authors try?

Pornography, erotica, shock content and "junk" advertising communities are built into the algorithm's exclusions and cannot receive the fire mark. In general, they are not good for a page's or community's reach. For everything else, there are no limits.

We covered the question of topics in the article "How the 'smart' feed and the 'Prometheus' talent-search algorithm work on VKontakte."

A user or community does not need to be directly connected with creative work in order to get the fire mark:

"The algorithm has no preferences or favorite topics. It can be a famous musician with a million fans or an IKEA furniture tester. If a person talks about their work and life with passion, they are an author. That means Prometheus will sooner or later mark them."

23. Is it true that the large reach provided by Prometheus in most cases does not convert into subscribers, likes and reposts? People say this is because it is not yet working quite correctly and shows posts to an irrelevant audience. Is that so? And if so, do you plan to fix it?

We also answered this question in the article, where we analyzed specific examples of completely different communities. The growth in new subscribers depends on the author, their activity and the quality of the published content. Some gained 80,000 new subscribers during their week of Prometheus, and some gained a couple of hundred. Large reach is great, but if the author cannot interest new users, the number of subscribers will not increase.

Prometheus shows a community's materials to an audience that may be interested in the author's subject matter. Of course, not everyone in a Modern Art community will like the works of a particular artist whom Prometheus has marked. A few may even go into the comments and voice their distaste. Seeing such isolated negativity, authors may wrongly conclude that the algorithm ignores users' interests and shows posts to everyone indiscriminately. This is not true. When you see negative comments, remember that they come from only a small percentage of a huge reach.

The algorithms do not stand still either. We update the models and make the audience search more accurate. The more accurately the audience is chosen, the faster the number of subscribers grows. But mostly it is up to the author. Interact, engage, and convert the reach that Prometheus gives you into subscribers.

