Academia | Photos want to stay still, but AI won't let them: MIT's cutting-edge tech turns images into short videos in seconds

Have you ever thought about this? When you are shown a photo, you may see not just a still image but a moving "mini-video". Today, with the help of machine learning, a computer can predict the next sequence of actions from a still photo with fairly high accuracy.

Whether it's a beautiful woman riding a bike, a dog catching a Frisbee, or someone suddenly falling, imagining how these actions continue is one of our most basic skills. We do it without consciously weighing the vast amount of information behind the prediction, such as gravity, inertia, and the instinctive response to a fall. Teaching a computer this ability to anticipate is therefore a key challenge in machine vision.

Researchers from the Massachusetts Institute of Technology are working on this problem, and they have shown a series of very impressive results. Using a specially trained neural network, they convert an image into a video and have the computer predict what will happen next. Their model still has many limitations: the videos are usually only a few seconds long, small in size, and often visually messy. But it is still an impressive feat of machine imagination, and it brings computers another step closer to understanding the world as humans do.

The neural network was trained on more than 2 million video clips downloaded from Flickr. All scenes fall into four types: golf courses, beaches, train stations, and hospitals. The footage was stabilized to eliminate camera shake. With this data, the team's neural network can not only generate short videos of scenes like these, but also produce a sequence of frames from a single still image. This is essentially a prediction of the action that will happen next. For now, however, the results are very limited: the network can only guess at how pixels will change, rather than drawing on an understanding of the whole scene.
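
To make concrete what guessing at pixel changes without understanding the scene looks like, here is a deliberately naive sketch in Python. It is not the MIT team's network: the names `shift_right` and `predict_frames` and the constant-shift rule are assumptions made purely for illustration. It only shows the shape of the task, turning one still frame into a sequence of frames.

```python
from typing import List

Frame = List[List[int]]  # grayscale image stored as rows of pixel intensities


def shift_right(frame: Frame, pixels: int) -> Frame:
    """Roll every row of the frame to the right, wrapping around the edge."""
    shifted = []
    for row in frame:
        k = pixels % len(row)
        shifted.append(row[-k:] + row[:-k] if k else list(row))
    return shifted


def predict_frames(still: Frame, num_frames: int, step: int = 1) -> List[Frame]:
    """Toy 'predictor' (hypothetical): frame t is the still shifted by t * step pixels.

    It moves pixels blindly and knows nothing about what the scene contains.
    """
    return [shift_right(still, t * step) for t in range(1, num_frames + 1)]


# A tiny 2x4 "photo" with one bright vertical stripe.
still = [
    [0, 9, 0, 0],
    [0, 9, 0, 0],
]
video = predict_frames(still, num_frames=3)
for frame in video:
    print(frame)
```

A rule like this moves every pixel the same way regardless of what the picture shows, which is the kind of limitation described above; a trained network learns its pixel changes from data instead of following a fixed rule.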

Here are the results.

On the beach, for example, you can see the waves rising and falling, and in the train station the model predicts the train's movement along the track. However, when asked to predict how someone will walk across the golf course, the results look distorted and the image is blurry.

The researchers mention that the computer's predictions often don't follow normal logic, but at least its judgments about the trajectory of the movement are reasonable.

Machine learning systems have made many advances in related areas, including predicting behaviors such as handshakes and hugs, and even generating audio to match videos. Yann LeCun, head of AI research at Facebook, touched on the topic in an interview last year, saying that predicting motion trajectories is an important part of building computers that can predict. However, researchers will need to put in more work before a machine truly understands a video or image and what might happen next in it.

"Suppose you're watching a Hitchcock movie, and at some point I ask, 'What will the plot of the movie be 15 minutes from now?' You would have to try to work out who the killer is at that point."

LeCun says, "To fully solve this problem requires understanding the world and human nature, and that's where the real fun is."

Artificial intelligence has become more and more capable at prediction, but more sophisticated models are needed to achieve more accurate, natural, and realistic results. Researchers may need to consider more factors, build more complex neural networks, and train models on more data. Only then will machine learning be able to truly anticipate how the motion in an image continues.

via The Verge
