
Inside Facebook’s artificial intelligence lab

By Dave Gershgorn

Posted on Sep 22, 2015 5:00 PM EDT

15 minute read

It’s time to stop thinking about Facebook as just a social media company. Between its efforts to deliver internet service with drones, buying Oculus for virtual reality, and its continued pursuit of artificial intelligence, Facebook has quickly become one of the most advanced technology research centers in the world.

It’s not alone: companies like Google and even IBM have similar schemes, and collectively, developments across the field have accelerated to the point that artificial intelligences will surely shape the way humans interact with computers. In fact, they already do — but quietly, behind the curtains. Facebook has a great interest in this technology, serving 1.5 billion monthly users. The company tackles the problem of emulating general intelligence — that is, getting computers to think less like linear, logical machines and more like free-form humans — with a multi-pronged approach. While the Facebook Artificial Intelligence Research (FAIR) team works on solving generalized AI problems, smaller groups like Language Technology and Facebook M deploy practical features to users.

The birth of artificial intelligence research at Facebook

It all started in 2013. Facebook founder and CEO Mark Zuckerberg, chief technology officer Mike Schroepfer, and other company leaders were taking stock of the company’s accomplishments since its launch almost a decade before, and looking for what would allow it to thrive for the next 10 or 20 years.

Facebook had already been using machine learning on its hugely popular social network to decide what users would see on their News Feeds, but it was simple compared to the cutting-edge neural networks of the time.

Some Facebook engineers had also been experimenting with convolutional neural networks (CNNs), a powerful flavor of machine learning that is now popularly used for identifying images. Zuckerberg was impressed by the potential of artificial intelligence, even in its early stages, so he hired an engineer out of Google Brain, Marc’Aurelio Ranzato. Then, he went to the source: the inventor of CNNs, Yann LeCun.

Yann LeCun, who now serves as the director of FAIR, comes from a storied tenure in artificial intelligence research. He began as a researcher at Bell Labs (named for telephone pioneer Alexander Graham Bell, and known for its experiments across myriad fields in telecommunications and technology) in 1988, then became a department head at AT&T Labs until 2003, when he began to teach at New York University. The modern convolutional neural network is a culmination of work throughout LeCun’s career. Ever wonder how an ATM can read your check? That was LeCun, whose early work included a neural network simulator called “SN”; his check-reading system was deployed in 1996.

“I started talking with Schroepfer and Mark, and I guess they liked what I told them,” LeCun said in an interview with Popular Science. “And then they tried to convince me to run it…When someone like Mark comes to you and says ‘Oh, okay, you pretty much have carte blanche. You can put together a world-class research lab and I expect you to build the best research lab in AI in the world.’ I’ll say, ‘Hmm, interesting challenge.’”

Yann had some ideas about what that world-class research lab would entail: if you want to attract top talent, you have to have an ambitious lab with ambitious long-term goals; you give people some freedom in their work; and you have to be very open about your research. “It lined up with sort of the philosophy at Facebook, which is a philosophy of openness,” LeCun said.

Assembling The Team

The team subsequently tasked with creating the future of Facebook is small: only about 30 research scientists and 15 engineers in total. Labor is divided across three branches: Facebook AI Research’s main office is in New York City’s Astor Place, where LeCun operates with a team of about 20 engineers and researchers. A similar number staffs the Menlo Park branch, and as of June, FAIR has a smaller Paris office of about five people, opened to collaborate with INRIA, the French Institute for Research in Computer Science and Automation. Others within Facebook work on AI deployment, like the Language Technology team; FAIR is the research arm.

These researchers and engineers come from all over the tech industry, and many have previously collaborated with LeCun. High-level artificial intelligence research isn’t an enormous field, and many of LeCun’s pupils have gone on to seed AI startups, several of which have been absorbed into larger companies like Twitter.

LeCun once told Wired that deep learning “is really a conspiracy between Geoff Hinton and myself and Yoshua Bengio, from the University of Montreal.” While Hinton works on AI at Google, and Bengio splits time between University of Montreal and data mining company ApStat, LeCun has been able to snag other top-shelf names.

“When I was first made a department head at Bell Labs, my boss told me: ‘There’s only two things you need to remember: First of all, never put yourself in competition with people in your group. Second, only hire people who are smarter than you,’” LeCun said.

Leon Bottou, who leads the research sub-group concerned with language, has been a longtime colleague of LeCun; they developed neural network simulators together, beginning in 1987 on AmigaOS. Bottou joined FAIR in March 2015 after working at Microsoft Research on machine learning and machine reasoning.

LeCun also brought Vladimir Vapnik onto the team as a consultant in November 2014; Vapnik and LeCun worked together at Bell Labs, publishing formative research on machine learning, including a technique to measure machine learning capacity. Vapnik is the father of statistical learning theory, which addresses prediction based on established data. Prediction, which seems like a simple task for a human, actually draws on an immense library of preconceived notions and observations of the world. (But more on that later.) Vapnik, a leader in this field, continues his work with an interest in knowledge propagation, applying cues from teacher-student interaction to machine learning.

The size and academic weight of the team allows Facebook to be ambitious with its long-term goal, which is nothing short of a system LeCun would call “unambiguously intelligent.”

“Right now, even the best AI systems are dumb, in the way that they don’t have common sense,” LeCun said. He talks about a situation where I pick up a bottle and leave the room. (We’re in a Facebook New York conference room called Gozer the Gozerian — sharing the name of the Ghostbusters villain — an ominous name for a room in which to discuss the birth of true machine intelligence.) The human brain has no trouble imagining the simple scenario of someone picking up a bottle and leaving a room, but to a machine, huge swaths of information are missing from that premise alone.

Yann says that as I imagined the situation in my mind, “You probably stood up, even though I didn’t say that in the sentence, you probably walked. You opened the door, you walked through the door, you closed the door maybe. The bottle is not in the room. I mean there are a lot of things you can deduce from that because you know the constraints of the real world. So I don’t have to tell you all those facts.”

The artificial intelligence community doesn’t yet know enough about how machines learn to achieve this level of inference. As a step toward that goal, Facebook is focusing on building machines that can learn well enough to understand the world around them.

The biggest barrier, says LeCun, is what’s called “unsupervised learning.” Right now machines mainly learn in one of two ways. The first is supervised learning, where the system is shown thousands of pictures of dogs until it understands the attributes of a dog. Google’s DeepDream offered a window into this method: researchers ran trained networks in reverse to visualize what they had learned.
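Supervised learning in practice looks something like the following minimal sketch (toy data and a toy network for illustration, written in modern PyTorch rather than the Lua Torch of the era; not Facebook's code). The defining trait is that the network improves only because every example arrives with a correct label.

```python
import torch
import torch.nn as nn

# Toy supervised learning: every input is paired with a human-provided label
# (say, 1 for "dog" and 0 for "not dog").
model = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 2))
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

inputs = torch.randn(100, 64)         # stand-in for image features
labels = torch.randint(0, 2, (100,))  # the answers the system learns from

for epoch in range(10):
    optimizer.zero_grad()
    loss = loss_fn(model(inputs), labels)  # error is measured against the labels
    loss.backward()                        # and propagated back through the network
    optimizer.step()
```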

The other is reinforcement learning, where the computer is shown information to identify and given only a “yes” or “no” answer on each decision it makes. This takes longer, but it forces the machine to build its own internal configurations, and it can yield robust results when the two learning forms are married. (Remember DeepMind playing Atari?) Unsupervised learning requires no feedback or input. It’s how humans learn, LeCun says: we observe, draw inferences, and add them to our bank of knowledge. That has proven to be a tough nut to crack.

“We don’t even have a basic principle on which to build this. We’re working on it, obviously,” LeCun says, and laughs. “We have lots of ideas, they just don’t work that well.”

Early Progress Toward A Truly Intelligent AI

But that’s not to say that there hasn’t been progress made. Right now, LeCun is excited about work on a “memory” network that can be integrated into present convolutional neural networks, giving them the ability to retain information. He likens the new mode of memory retention to short-term and long-term memory in the brain, governed by the hippocampus and cerebral cortex respectively. (LeCun actually detests CNNs being compared to brains, instead preferring a model of a black box with 500 million knobs.)

The memory module allows researchers to tell the network a story, and then have it answer questions about the story later.

For the story, they used J.R.R. Tolkien’s Lord of the Rings. Well, not the entire book, but short summaries of major plot points. (“Bilbo took the ring.”) When asked where the ring was at certain points in the story, the AI could give short, correct answers. This means it “understands” relationships between objects and time, according to CTO Mike Schroepfer, who stressed this technology’s ability to help Facebook show you what you want to see with higher accuracy.
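The task format itself is easy to picture; here is a hand-coded toy stand-in (for illustration only; the memory network learns such associations from data rather than following rules like these):

```python
# Toy stand-in for the story-QA task. The real memory network learns this
# behavior from data; the "understanding" here is hand-coded to show the format.
story = [
    "Bilbo took the ring",
    "Bilbo went back to the Shire",
    "Bilbo left the ring",
    "Frodo took the ring",
]

def who_has(item, story):
    holder = None
    for sentence in story:
        subject, verb = sentence.split()[0], sentence.split()[1]
        if item in sentence and verb == "took":
            holder = subject
        elif item in sentence and verb == "left":
            holder = None
    return holder

print(who_has("ring", story))  # -> "Frodo"
```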

“By building systems that understand the context of the world, understand what it is you want, we can help you there,” Schroepfer said at a developer presentation in March. “We can build systems that make sure all of us spend time on the things we care about.”

The FAIR team is developing this context around a project called “Embed the World.” To help machines better understand reality, the FAIR team is teaching them to represent the relationships between everything as vectors: posts, comments, photos, and video. The neural network creates an intricate web of content that groups like pieces of media together and pushes different ones apart.

With this system, LeCun says that we can start to “replace reasoning with algebra.” And it’s incredibly powerful. The artificial neural networks developed in the Embed the World project can link two photos that were taken in the same location based on visual similarities in the photos, but also figure out if text describes the scene. It’s recreating a virtual memory of reality, and clustering it in the context of other places and events. It can even “virtually represent a person,” based on their previous likes, interests, and digital experiences. This is somewhat experimental, but has great implications for Facebook’s News Feed and is used in a limited way to track hashtags.
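“Replacing reasoning with algebra” means questions about content become vector arithmetic: items embedded close together are related. A minimal sketch with made-up vectors (not Facebook's embeddings):

```python
import torch
import torch.nn.functional as F

# Hypothetical learned embeddings: similar content maps to nearby vectors.
beach_photo   = torch.tensor([0.9, 0.1, 0.3])
beach_caption = torch.tensor([0.8, 0.2, 0.35])
hockey_video  = torch.tensor([0.1, 0.9, 0.7])

# "Does this text describe this scene?" becomes a similarity score.
print(F.cosine_similarity(beach_photo, beach_caption, dim=0))  # high: same scene
print(F.cosine_similarity(beach_photo, hockey_video, dim=0))   # low: unrelated
```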

There’s a lot of talk about long-term goals, but small victories along the way have made Facebook incrementally smarter. In June 2014, Facebook researchers published a paper titled “DeepFace: Closing the Gap to Human-Level Performance in Face Verification,” which claimed more than 97 percent accuracy in recognizing faces. LeCun says he’s confident Facebook’s facial recognition is the best in the world, and that it’s a key difference between Facebook and academic research institutions. Now, DeepFace is the driving force behind Facebook’s automatic photo tagging.

“If we have an idea that actually works, within a month it can be in front of 1.5 billion people,” LeCun said. “Let’s keep our eyes focused on the horizon, where our long-term goal is, but on the way there are a lot of things that we’re going to build that are going to have applications in the short term.”

Rob Fergus, a veteran of NYU and MIT’s Computer Science and Artificial Intelligence Laboratory, leads the AI research team concerned with vision. His team’s work can already be seen in the automatic tagging of photos, but Fergus says the next step is video. Lots of video is “lost” in the noise because it lacks metadata or descriptive text. AI would “watch” the video and be able to classify it arbitrarily.

This has major implications for stopping content Facebook doesn’t want from getting onto its servers—like pornography, copyrighted content, or anything else that violates its terms of service. It could also identify news events and curate different categories of video. Facebook has traditionally farmed these tasks out to contracted companies, so this could play a role in mitigating costs.

In current tests, the AI shows promise. When shown a video of sports being played, like hockey, basketball or table tennis, it can correctly identify the sport. It can tell baseball from softball, rafting from kayaking, and basketball from street ball.

The AI Behind Facebook

A separate group within Facebook, called Language Technology, focuses on developing translation, speech recognition, and natural language understanding. FAIR, LeCun’s realm, is the research arm of Facebook’s AI push, and Language Technology (under the umbrella of Applied Machine Learning) is one of the places that actually deploys the software.

They collaborate with FAIR but stand alone in their development and deployment, and their work supports 493 actively used translation directions (English to French and French to English count as two directions).

With Facebook’s creed to make the world more open and connected, language services are a natural route. More than half of users don’t speak English, yet English makes up most of the content on Facebook, says Language Technology head Alan Packer.

There are 330 million people using these translation services, which are most often accessed by clicking the “See Translation” button. If you’ve been the first person to click the translation button, congratulations, you’ve operated artificial intelligence. The first click initiates the translation request to the server, which is then cached for other users. Packer says that Shakira’s posts are translated almost instantly. The team is also rolling out native translation of content, which will display a “See the original” button.
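The serving pattern Packer describes, translate once on the first click and cache the result for everyone else, can be sketched in a few lines (hypothetical names and function signature, not Facebook's infrastructure):

```python
# Sketch of the translate-once caching pattern described above.
translation_cache = {}

def see_translation(post_id, text, target_lang, translate_fn):
    """The first click pays for the model call; later clicks hit the cache."""
    key = (post_id, target_lang)
    if key not in translation_cache:
        translation_cache[key] = translate_fn(text, target_lang)  # expensive call
    return translation_cache[key]
```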

Artificial intelligence is necessary in this role because “dumb” translation is ineffective in relating how humans interact with each other. It generates improper syntax, misunderstands idioms, and has no reference for slang. This is a flaw with direct, word-to-word translation like the Google Translate of yore.

Packer says that figures of speech are particularly difficult, but something an AI that understands underlying semantic meaning would catch.

“The phrase ‘hot dog,’ if you just translate those words literally into French, it’s not going to work. ‘Chaud chien’ means nothing to a French person,” Packer said. “And then if you have a picture of me skiing and I say, ‘I’m hot dogging it today,’ that turns out to be really hard to learn, that hot dogging means showing off.”

This understanding isn’t at scale yet, but early results suggest the task isn’t insurmountable. Packer says the trick isn’t just understanding metaphors and idioms, but also recognizing when not to take them literally.

The AI is adaptive by nature, and can be trained on slang quickly. The Language Technology team recently learned that French soccer fans were using a new slang term for “wow,” and after training the neural network on that public data, it can now reliably translate the term. They’re working to grow Facebook’s lexicon by training on new data every day, but for now all languages are updated monthly.

We’re used to digital personal assistants by now, like Siri, Cortana, and Google Now. But Facebook took a different approach with its new AI personal assistant, M, which offers the ability to execute complex tasks outside the confines of your phone. Siri can send a text, but M can book a flight and make travel plans. During the development process, a Facebook employee even got M to schedule a series of in-home appraisals with moving companies. (You can’t buy tobacco, alcohol, escorts, or guns with M, though.)

The backbone of Facebook M actually comes from Wit.ai, a startup acquired earlier this year. Its team joined the Messenger team under VP David Marcus and debuted M earlier this month.

Alex LeBrun, who leads the Wit.ai team within Facebook, says that artificial intelligence makes M better not only at generalized tasks, but also at cases with very special exceptions, like traveling with an infant or during blackout dates. It also means that as AI improves, so do M’s capabilities. He’s hopeful that within even three years, M will be able to call the cable company or the DMV and wait on hold for users.

“The true added value of a service like M is to be able to fulfill your request even if it’s a little bit specific or weird,” LeBrun says. “It will do it even if it’s complex and not the mainstream case.”

And M learns as it goes along. Right now, it’s not robust enough to stand alone. A team of “AI trainers” works with the program, and if there’s a request that M doesn’t understand, the trainers take over. M then learns from what the human trainer does, and can use that technique with later requests. There’s also an element of randomness built into the program, LeBrun says, to bring it closer to human learning. This means that it will sometimes try to find novel, more efficient ways to do a common task.

“AI trainer” is a new position, and one that even Facebook is still trying to figure out. They do say, however, that it’s not a job for researchers and engineers, but instead more geared for people with customer service experience. As time goes on, Facebook will be able to evaluate how many requests require human interference, but the eventual hope is that humans won’t be needed at all in the future.

The trainers are essential to the development process, though, because their job is twofold: serve as the last line of defense for quality control, and teach the AI.

And with human intelligence as the gatekeeper, M can be used as a sandbox for FAIR’s development. “As soon as they have something to test, it will surface in M, because with our training and supervision, it’s really risk-free,” LeBrun says.

The M platform is built entirely on Wit.ai’s technology, most of which was developed before the acquisition, but FAIR will also use the deep learning data gathered from users interacting with the personal assistant.

Facebook In The Community

“The research we do, we’re doing it in the open. Pretty much everything we do is published, a lot of the code we write is open-sourced,” LeCun says. Those publications are available on Facebook’s research site, and also on arXiv, a library of research papers in computer science, mathematics, and physics.

This goes for a lot of the artificial intelligence community. LeCun has been a leading figure in developing Torch, an open-source library for machine learning development. Along with the rest of the team at Facebook, he works with researchers at Twitter and Google’s DeepMind to make Torch a better tool for everyone. (Many of the experts now in the field were once students of LeCun, as well.)

Anything else they might publish, from work that could be integrated into medical imaging or self-driving cars, is open to be used to further the field, LeCun says. The work that Facebook does is important to Facebook users, but at its core the research team strives toward furthering humanity’s collective knowledge of how to better emulate intelligence with machinery.

This is why Facebook is an important part of the artificial intelligence community, and why the community itself is so important.

“The scenario you see in a Hollywood movie, in which some isolated guy in Alaska comes up with a fully-functional AI system that nobody else is anywhere close to, is completely impossible,” LeCun said. “This is one of the biggest, most complicated scientific challenges of our time, and no single entity, even a big company, can solve it by itself. It has to be a collaborative effort between the entire research and development community.”


Opinionated and open machine learning: The nuances of using Facebook's PyTorch


The release of PyTorch 1.0 beta was part of the big news in last week's machine learning (ML) Octoberfest, along with fast.ai, Neuton, and MLFlow. With AI being what it is today, and machine learning powering a good deal of what is going on there, such news causes ripples beyond the ML community.

Also: Facebook open-source AI framework PyTorch 1.0 released

At last week's Spark AI Summit Europe, we had the chance to speak with some of the rock stars of this community. MLFlow's new version was presented in Databricks Chief Technologist Matei Zaharia's keynote, and PyTorch 1.0 was presented in Facebook AI Research Engineer Soumith Chintala's keynote.

Chintala is the creator and project lead for PyTorch, one of the top machine learning frameworks. After his keynote, ZDNet caught up with Chintala on a number of topics, ranging from Facebook's motivation and strategy for PyTorch, to the specifics of using it.

How many machine learning frameworks does the world need?

It may sound like a trivial question to ask, but we felt we had to get it out of the way: What was the thinking behind Facebook getting involved and investing resources in its own ML framework with PyTorch? Especially considering that there is another ML framework supported by Facebook, Caffe2.

For cloud vendors like AWS, Google, or Microsoft, there is a very clear incentive: The more people use their ML frameworks, the more compute and storage resources will eventually gravitate toward their respective clouds. But what stakes does Facebook have in this game? Why dedicate the resources -- human and otherwise -- needed to develop and maintain not one, but two ML frameworks, and where is this going?

Also: Fast.ai's software could radically democratize AI

The thinking behind this was not as calculated as you may expect, said Chintala. He and the PyTorch team set out to build it simply because they are opinionated and wanted something tailored to their needs:

"Google's TensorFlow was released in 2015. We tried using it, but were not super happy with it. Before this, we tried Caffe1, Theano, and Torch. At the time, we were using Torch and Caffe1 for research and production. The field has changed a lot, and we felt a new tool was needed. Looked like nobody else was building it, not the way we thought will be needed in the future. So, we felt we should build it. We showed it to some people, and they liked it. There is a strong open source culture in Facebook, so we open sourced it, and it took off. But the goal was mostly to make ourselves happy. It was not because we did not want to rely on Google or Microsoft."

Soumith Chintala is a Research Engineer at Facebook AI Research and the creator of PyTorch. His motivation for creating it? Having something that works according to his needs.

Chintala now works full time on PyTorch, and his team includes somewhere between 10 and 15 people. Even though intellectual curiosity and opinions may account for taking the step to create PyTorch, that does not explain why Facebook would assign these people to work on PyTorch in the long run. Or does it?

Also: Startup uses AI and machine learning for real-time background checks

Chintala's take is that some people would have to be assigned to something like this anyway. If PyTorch had not been created, the other option would be to tweak some existing framework, which would end up requiring the same resources. But then, what about Caffe2? Why maintain two ML frameworks?

Machine Learning in Facebook AI Research and in production

First off, PyTorch is now officially the one Facebook ML framework to rule them all. PyTorch 1.0 marks the unification of PyTorch and Caffe2. Going forward, Chintala explained, the choice made was to use PyTorch for the front end and Caffe2 for the back end. This means that nothing changes for users of previous versions of PyTorch, but the Caffe2 front end will be deprecated, and people will have to switch to PyTorch.

Also: Facebook advances computer vision using hashtagged pictures

This has to do with the philosophy and goal of each framework. PyTorch was built for researchers who need flexibility. Caffe2 was aimed at running in production at extreme scale -- something like 300 trillion inferences per day, as Chintala noted. As you can imagine, merging the two was no easy feat:

"It's not just merging two codebases, but two projects and philosophies. The hardest part was still the technical one, but culture was a close second. It was mostly about how the frameworks evolve, and what needs they are addressing. At production scale, code has different needs and properties then when you build for research, where you want everyone to be able to express their ideas and be as creative as possible," Chintala said.

Facebook AI Research has more than 100 researchers working on various projects. Open source and publishing is key to its philosophy, according to Chintala.

This distinction was also pronounced when discussing the infrastructure people at Facebook AI Research (FAIR) use. FAIR employs more than 100 people. Chintala noted that you can see people doing research on their laptops, and he personally uses local disk for storage a lot, as it makes it easier for him to work with files. But it really depends on the project.

"We also use Hive, Presto, and Spark. Some projects really push the limits of scale, and for those we use organizational infrastructure," Chintala said.

Also: 10 ways AI will impact the enterprise in 2018 TechRepublic

Another thing that FAIR does a lot, according to Chintala, is publish. Not just code, but also research papers and data:

"In fundamental research, you have to work with peers in the community, otherwise you get siloed. You may think you are working on something awesome, and then publish it five years later and realize it is horrible. We focus on open datasets, and publish right away. And we also leverage graph structures , to some extent. For example, we use Wikipedia for question answering. Its structure is graph-like, and there also is the structured version, DBpedia. We do a lot of research on dialog and question answering. For these we use text datasets, and we also synthesize our own. In vision, we use large scale vision datasets. Another example is how we use hashtags to enhance translation. We may have images with many hashtags describing the same thing in different languages, and we embed images and hashtags in a graph structure and then work on the translation. Although we do such things a lot, I don't remember having worked with Facebook's social graph."

Low level, high level, and opinionated openness

On Spark AI Summit's stage, Chintala showed many of the specifics of working with PyTorch. What impressed us was that, to the untrained eye, many of the code fragments that Chintala used seemed quite raw and low-level. This, however, is intentional, and there are higher level shortcuts as well, as Chintala explained.

Let's take building a neural network, for example. In PyTorch, this is done via a function that looks a bit messy. But according to Chintala, this is what people want:

"This is one of the biggest reasons why people use PyTorch. This function describes a neural network. It may not be well-structured, but it's where people get their expressiveness from. Our target audience is well-acquainted with this, and they want to use it this way. Let's say you want to build a recurrent neural network, and you have some time series you need to use. In other frameworks you need to use an API, construct a time series, etc. In PyTorch, you just use a for loop. Users find this more intuitive, because there is no extra step needed - you just write code."

Not everything has to be low-level, however. Chintala pointed out that for state of the art models, such as ResNet50, there are one-liners that encapsulate them and can be used in the code. PyTorch also comes with an array of pre-trained models ("model zoo"), out-of-the-box distributed capabilities, and integration with probabilistic programming, machine translation, natural language processing, and more.
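For instance, the torchvision model zoo exposes ResNet50 as a one-liner (the pretrained flag reflects the torchvision API of that era; newer releases use a weights argument instead):

```python
import torchvision.models as models

# One line pulls in the full 50-layer architecture, with pre-trained weights.
resnet = models.resnet50(pretrained=True)
resnet.eval()  # ready for inference
```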

Also: AI means a lifetime of training CNET

Occasionally, these can look deceptively simple. Could this be a problem? For example, when showcasing PyTorch's abstraction for distributed deep learning, it was hard to believe all the nitty-gritty details could be taken care of by one line of code: where the dataset comes from, which node gets each part, and so on.

In this case, Chintala explained, users can intervene at a lower level and fine-tune data loaders, for example. But the idea here was that in 90 percent of cases there is a structured, well-formed pattern of how most people do distributed deep learning, and the one-liner is built to leverage this. And it seems to work well, considering the near-perfect linear scaling in the graph Chintala shared.
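The abstraction in question is PyTorch's DistributedDataParallel wrapper; a minimal runnable sketch follows (single-process demo setup for illustration; a real job launches one process per GPU):

```python
import os
import torch
import torch.nn as nn
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel

# Single-process demo setup; real jobs launch one process per GPU.
os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
os.environ.setdefault("MASTER_PORT", "29500")
dist.init_process_group(backend="gloo", rank=0, world_size=1)

model = DistributedDataParallel(nn.Linear(128, 10))  # the "one line"
out = model(torch.randn(4, 128))  # gradients sync across workers on backward()
dist.destroy_process_group()
```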

This is how you create a neural network in PyTorch. It may seem low-level to you, but that's exactly how its creators wanted it to be.

So, despite the fact that PyTorch is opinionated, it looks like its creators tried to strike a balance with the ability to accommodate different usage patterns, at least to some extent. As Chintala noted, one of the goals was to make PyTorch a package anyone in the Python ecosystem can use, regardless of what they may be using currently.

Also: How Facebook scales AI

PyTorch's promise is that the entire Python ecosystem can be used at will. In fact, there is a mechanism called zero-memory copy in place to facilitate this:

"We don't have the 'not invented here' syndrome. You can take a PyTorch tensor and connect it to NumPy. It works by creating a NumPy struct in C, and then directly using a pointer on it. All you have to do is a very cheap, free almost, operation - query the C struct. In NumPy many many things are done in C, and we use it, too."

This may seem like a triviality, but it goes to show the thinking behind PyTorch: high level and low level, opinions and openness intertwined, a balancing act. In all fairness, much of open source development, research, and Facebook itself for that matter, walks similar fine lines. This may help the world benefit from cutting-edge research at FAIR, and help FAIR researchers keep up to date with the broader community.



Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

facebookresearch/fairseq


Fairseq(-py) is a sequence modeling toolkit that allows researchers and developers to train custom models for translation, summarization, language modeling and other text generation tasks.

We provide reference implementations of various sequence modeling papers:

  • Language Modeling with Gated Convolutional Networks (Dauphin et al., 2017)
  • Convolutional Sequence to Sequence Learning (Gehring et al., 2017)
  • Classical Structured Prediction Losses for Sequence to Sequence Learning (Edunov et al., 2018)
  • Hierarchical Neural Story Generation (Fan et al., 2018)
  • wav2vec: Unsupervised Pre-training for Speech Recognition (Schneider et al., 2019)
  • Pay Less Attention with Lightweight and Dynamic Convolutions (Wu et al., 2019)
  • Effective Approaches to Attention-based Neural Machine Translation (Luong et al., 2015)
  • Attention Is All You Need (Vaswani et al., 2017)
  • Scaling Neural Machine Translation (Ott et al., 2018)
  • Understanding Back-Translation at Scale (Edunov et al., 2018)
  • Adaptive Input Representations for Neural Language Modeling (Baevski and Auli, 2018)
  • Lexically constrained decoding with dynamic beam allocation (Post & Vilar, 2018)
  • Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context (Dai et al., 2019)
  • Adaptive Attention Span in Transformers (Sukhbaatar et al., 2019)
  • Mixture Models for Diverse Machine Translation: Tricks of the Trade (Shen et al., 2019)
  • RoBERTa: A Robustly Optimized BERT Pretraining Approach (Liu et al., 2019)
  • Facebook FAIR's WMT19 News Translation Task Submission (Ng et al., 2019)
  • Jointly Learning to Align and Translate with Transformer Models (Garg et al., 2019)
  • Multilingual Denoising Pre-training for Neural Machine Translation (Liu et al., 2020)
  • Neural Machine Translation with Byte-Level Subwords (Wang et al., 2020)
  • Unsupervised Quality Estimation for Neural Machine Translation (Fomicheva et al., 2020)
  • wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations (Baevski et al., 2020)
  • Generating Medical Reports from Patient-Doctor Conversations Using Sequence-to-Sequence Models (Enarvi et al., 2020)
  • Linformer: Self-Attention with Linear Complexity (Wang et al., 2020)
  • Cross-lingual Retrieval for Iterative Self-Supervised Training (Tran et al., 2020)
  • Deep Transformers with Latent Depth (Li et al., 2020)
  • Unsupervised Cross-lingual Representation Learning for Speech Recognition (Conneau et al., 2020)
  • Self-training and Pre-training are Complementary for Speech Recognition (Xu et al., 2020)
  • Robust wav2vec 2.0: Analyzing Domain Shift in Self-Supervised Pre-Training (Hsu, et al., 2021)
  • Unsupervised Speech Recognition (Baevski, et al., 2021)
  • Simple and Effective Zero-shot Cross-lingual Phoneme Recognition (Xu et al., 2021)
  • VideoCLIP: Contrastive Pre-training for Zero-shot Video-Text Understanding (Xu et al., 2021)
  • VLM: Task-agnostic Video-Language Model Pre-training for Video Understanding (Xu et al., 2021)
  • NormFormer: Improved Transformer Pretraining with Extra Normalization (Shleifer et al., 2021)
  • Non-Autoregressive Neural Machine Translation (Gu et al., 2017)
  • Deterministic Non-Autoregressive Neural Sequence Modeling by Iterative Refinement (Lee et al. 2018)
  • Insertion Transformer: Flexible Sequence Generation via Insertion Operations (Stern et al. 2019)
  • Mask-Predict: Parallel Decoding of Conditional Masked Language Models (Ghazvininejad et al., 2019)
  • Levenshtein Transformer (Gu et al., 2019)
  • Better Fine-Tuning by Reducing Representational Collapse (Aghajanyan et al. 2020)

What's New:

  • May 2023 Released models for Scaling Speech Technology to 1,000+ Languages (Pratap, et al., 2023)
  • June 2022 Released code for wav2vec-U 2.0 from Towards End-to-end Unsupervised Speech Recognition (Liu, et al., 2022)
  • May 2022 Integration with xFormers
  • December 2021 Released Direct speech-to-speech translation code
  • October 2021 Released VideoCLIP and VLM models
  • October 2021 Released multilingual finetuned XLSR-53 model
  • September 2021 master branch renamed to main.
  • July 2021 Released DrNMT code
  • July 2021 Released Robust wav2vec 2.0 model
  • June 2021 Released XLMR-XL and XLMR-XXL models
  • May 2021 Released Unsupervised Speech Recognition code
  • March 2021 Added full parameter and optimizer state sharding + CPU offloading
  • February 2021 Added LASER training code
  • December 2020: Added Adaptive Attention Span code
  • December 2020: GottBERT model and code released
  • November 2020: fairseq 0.10.0 released; see documentation explaining how to use it for new and existing projects
  • October 2020: Added R3F/R4F (Better Fine-Tuning) code
  • October 2020: Deep Transformer with Latent Depth code released
  • October 2020: Added CRISS models and code
  • September 2020: Added Linformer code
  • September 2020: Added pointer-generator networks
  • August 2020: Added lexically constrained decoding
  • August 2020: wav2vec2 models and code released
  • July 2020: Unsupervised Quality Estimation code released
  • May 2020: Follow fairseq on Twitter
  • April 2020: Monotonic Multihead Attention code released
  • April 2020: Quant-Noise code released
  • April 2020: Initial model parallel support and 11B parameters unidirectional LM released
  • March 2020: Byte-level BPE code released
  • February 2020: mBART model and code released
  • February 2020: Added tutorial for back-translation
  • December 2019: fairseq 0.9.0 released
  • November 2019: VizSeq released (a visual analysis toolkit for evaluating fairseq models)
  • November 2019: CamemBERT model and code released
  • November 2019: BART model and code released
  • November 2019: XLM-R models and code released
  • September 2019: Nonautoregressive translation code released
  • August 2019: WMT'19 models released
  • July 2019: fairseq relicensed under MIT license
  • July 2019: RoBERTa models and code released
  • June 2019: wav2vec models and code released
Features:

  • multi-GPU training on one machine or across multiple machines (data and model parallel)
  • beam search
  • Diverse Beam Search (Vijayakumar et al., 2016)
  • sampling (unconstrained, top-k and top-p/nucleus)
  • lexically constrained decoding (Post & Vilar, 2018)
  • gradient accumulation enables training with large mini-batches even on a single GPU
  • mixed precision training (trains faster with less GPU memory on NVIDIA tensor cores )
  • extensible : easily register new models, criterions, tasks, optimizers and learning rate schedulers
  • flexible configuration based on Hydra allowing a combination of code, command-line and file based configuration
  • full parameter and optimizer state sharding
  • offloading parameters to CPU

We also provide pre-trained models for translation and language modeling with a convenient torch.hub interface:
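A typical invocation looks like the following (model and argument names as listed in the fairseq documentation; weights download on first use):

```python
import torch

# Load a pre-trained English-to-German translation model via torch.hub.
en2de = torch.hub.load('pytorch/fairseq', 'transformer.wmt19.en-de.single_model',
                       tokenizer='moses', bpe='fastbpe')
print(en2de.translate('Hello world!'))  # e.g. 'Hallo Welt!'
```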

See the PyTorch Hub tutorials for translation and RoBERTa for more examples.

Requirements and Installation

  • PyTorch version >= 1.10.0
  • Python version >= 3.8
  • For training new models, you'll also need an NVIDIA GPU and NCCL
  • To install fairseq and develop locally, clone the repository and run pip install --editable ./ from the checkout
  • For faster training, install NVIDIA's apex library
  • For large datasets, install PyArrow: pip install pyarrow
  • If you use Docker, make sure to increase the shared memory size, either with --ipc=host or --shm-size as command line options to nvidia-docker run.

Getting Started

The full documentation contains instructions for getting started, training new models and extending fairseq with new model types and tasks.

Pre-trained models and examples

We provide pre-trained models and pre-processed, binarized test sets for several tasks listed below, as well as example training and evaluation commands.

  • Translation : convolutional and transformer models are available
  • Language Modeling : convolutional and transformer models are available

We also have more detailed READMEs to reproduce results from specific papers:

  • XLS-R: Self-supervised Cross-lingual Speech Representation Learning at Scale (Babu et al., 2021)
  • Training with Quantization Noise for Extreme Model Compression ({Fan*, Stock*} et al., 2020)
  • Reducing Transformer Depth on Demand with Structured Dropout (Fan et al., 2019)

Join the fairseq community

  • Twitter: https://twitter.com/fairseq
  • Facebook page: https://www.facebook.com/groups/fairseq.users
  • Google group: https://groups.google.com/forum/#!forum/fairseq-users

fairseq(-py) is MIT-licensed. The license applies to the pre-trained models as well.


Meta drops AI bombshell: Multi-token prediction models now open for research


Meta has thrown down the gauntlet in the race for more efficient artificial intelligence. The tech giant released pre-trained models on Wednesday that leverage a novel multi-token prediction approach, potentially changing how large language models (LLMs) are developed and deployed.

In April we published a paper on a new training approach for better & faster LLMs using multi-token prediction. To enable further exploration by researchers, we’ve released pre-trained models for code completion using this approach on @HuggingFace ⬇️ https://t.co/OnUsGcDpYx — AI at Meta (@AIatMeta) July 3, 2024

This new technique, first outlined in a Meta research paper in April, breaks from the traditional method of training LLMs to predict just the next word in a sequence. Instead, Meta’s approach tasks models with forecasting multiple future words simultaneously, promising enhanced performance and drastically reduced training times.
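Conceptually, instead of a single output head predicting token t+1, the model trains several heads in parallel, one per future offset. A minimal sketch of that idea (a toy shared-trunk module, not Meta's released architecture):

```python
import torch
import torch.nn as nn

class MultiTokenHead(nn.Module):
    """Toy multi-token predictor: a shared trunk feeds n independent heads,
    where head i predicts the token i+1 steps ahead."""
    def __init__(self, hidden_size, vocab_size, n_future=4):
        super().__init__()
        self.trunk = nn.Linear(hidden_size, hidden_size)  # stand-in for a transformer
        self.heads = nn.ModuleList(
            [nn.Linear(hidden_size, vocab_size) for _ in range(n_future)]
        )

    def forward(self, hidden_states):
        shared = torch.relu(self.trunk(hidden_states))
        # One set of logits per future position, computed in parallel.
        return [head(shared) for head in self.heads]

model = MultiTokenHead(hidden_size=64, vocab_size=1000, n_future=4)
logits = model(torch.randn(2, 64))   # batch of 2 context representations
print(len(logits), logits[0].shape)  # 4 heads, each (2, 1000)
```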

The implications of this breakthrough could be far-reaching. As AI models balloon in size and complexity, their voracious appetite for computational power has raised concerns about cost and environmental impact. Meta’s multi-token prediction method might offer a way to curb this trend, making advanced AI more accessible and sustainable.

Democratizing AI: The promise and perils of efficient language models

The potential of this new approach extends beyond mere efficiency gains. By predicting multiple tokens at once, these models may develop a more nuanced understanding of language structure and context. This could lead to improvements in tasks ranging from code generation to creative writing, potentially bridging the gap between AI and human-level language understanding.


However, the democratization of such powerful AI tools is a double-edged sword. While it could level the playing field for researchers and smaller companies, it also lowers the barrier for potential misuse . The AI community now faces the challenge of developing robust ethical frameworks and security measures that can keep pace with these rapid technological advancements.

Meta’s decision to release these models under a non-commercial research license on Hugging Face, a popular platform for AI researchers, aligns with the company’s stated commitment to open science. But it’s also a strategic move in the increasingly competitive AI landscape, where openness can lead to faster innovation and talent acquisition.

The initial release focuses on code completion tasks, a choice that reflects the growing market for AI-assisted programming tools. As software development becomes increasingly intertwined with AI, Meta’s contribution could accelerate the trend towards human-AI collaborative coding.

The AI arms race heats up: Meta’s strategic play in the tech battlefield

However, the release isn’t without controversy. Critics argue that more efficient AI models could exacerbate existing concerns about AI-generated misinformation and cyber threats. Meta has attempted to address these issues by emphasizing the research-only nature of the license, but questions remain about how effectively such restrictions can be enforced.

The multi-token prediction models are part of a larger suite of AI research artifacts released by Meta , including advancements in image-to-text generation and AI-generated speech detection. This comprehensive approach suggests that Meta is positioning itself as a leader across multiple AI domains, not just in language models.

As the dust settles on this announcement, the AI community is left to grapple with its implications. Will multi-token prediction become the new standard in LLM development? Can it deliver on its promises of efficiency without compromising on quality? And how will it shape the broader landscape of AI research and application?

The researchers themselves acknowledge the potential impact of their work, stating in the paper : “Our approach improves model capabilities and training efficiency while allowing for faster speeds.” This bold claim sets the stage for a new phase of AI development, where efficiency and capability go hand in hand.

One thing is clear: Meta’s latest move has added fuel to the already blazing AI arms race. As researchers and developers dive into these new models, the next chapter in the story of artificial intelligence is being written in real-time.



Hackers beware: Research shows AI can assist with cybersecurity

A Mizzou researcher and collaborators found that leading chatbots can pass certified ethical hacking exams.


July 9, 2024 Contact: Janese Heavin, [email protected]

Chatbots powered by artificial intelligence (AI) can pass a cybersecurity exam, but don’t rely on them for complete protection.

That’s the conclusion of a recent paper co-authored by University of Missouri researcher Prasad Calyam and collaborators from Amrita University in India. The team tested two leading generative AI tools — OpenAI’s ChatGPT and Google’s Bard — using a standard certified ethical hacking exam.

Certified Ethical Hackers are cybersecurity professionals who use the same tricks and tools as malicious hackers to find and fix security flaws. Ethical hacking exams measure a person’s knowledge of different types of attacks, how to protect systems and how to respond to security breaches.


ChatGPT and Bard, now Gemini, are advanced AI programs called large language models. They generate human-like text using networks with billions of parameters that allow them to answer questions and create content.

In the study, Calyam and team tested the bots with standard questions from a validated certified ethical hacking exam. For example, they challenged the AI tools to explain a man-in-the-middle attack — an attack in which a third party intercepts communication between two systems. Both were able to explain the attack and suggest security measures to prevent it.

Overall, Bard slightly outperformed ChatGPT in terms of accuracy while ChatGPT exhibited better responses in terms of comprehensiveness, clarity and conciseness, researchers found.

“We put them through several scenarios from the exam to see how far they would go in terms of answering questions,” said Calyam, the Greg L. Gilliom Professor of Cyber Security in Electrical Engineering and Computer Science at Mizzou. “Both passed the test and had good responses that were understandable to individuals with background in cyber defense — but they are giving incorrect answers, too. And in cybersecurity, there’s no room for error. If you don’t plug all of the holes and rely on potentially harmful advice, you’re going to be attacked again. And it’s dangerous if companies think they fixed a problem but haven’t.”

Researchers also found that when the platforms were asked to confirm their responses with prompts such as “are you sure?” both systems changed their answers, often correcting previous errors. When the programs were asked for advice on how to attack a computer system, ChatGPT referenced “ethics” while Bard responded that it was not programmed to assist with that type of question.

Calyam doesn’t believe these tools can replace the problem-solving expertise of human cybersecurity experts in devising robust cyber defense measures, but they can provide baseline information for individuals or small companies needing quick assistance.

“These AI tools can be a good starting point to investigate issues before consulting an expert,” he said. “They can also be good training tools for those working with information technology or who want to learn the basics on identifying and explaining emerging threats.” 

The most promising part? The AI tools are only going to continue to improve their capabilities, he said.

“The research shows that AI models have the potential to contribute to ethical hacking, but more work is needed to fully harness their capabilities,” Calyam said. “Ultimately, if we can guarantee their accuracy as ethical hackers, we can improve overall cybersecurity measures and rely on them to help us make our digital world safer and more secure.”

The study, “ChatGPT or Bard: Who is a better Certified Ethical Hacker,” was published in the May issue of the journal Computers & Security. Co-authors were Raghu Raman and Krishnashree Achuthan.

Global Market Research: China leads world in GenAI usage while US leads in full implementation

Leaders note lack of understanding, business strategy, sufficient data and regulation preparedness as concerns; data privacy, security and governance are primary challenges.

Generative AI is here to stay. Organizations around the world are enthusiastically using and investing in the technology. But which regions and countries are leading in the use of GenAI? China is in the lead, according to a recent global study SAS commissioned with Coleman Parkes Research Ltd. Chinese business decision makers report that 83% of their organizations are using the technology. That’s more than in the United Kingdom (70%), the United States (65%) and Australia (63%). But organizations in the United States are ahead in terms of maturity, having fully implemented GenAI technologies at a rate of 24%, compared to China’s 19% and the United Kingdom’s 11%.

What does this mean in terms of the global economic impact of AI and GenAI? In a 2023 report, McKinsey estimated GenAI could add the equivalent of $2.6 trillion to $4.4 trillion annually across a variety of use cases. That’s comparable to the entire GDP of the United Kingdom in 2021. This impact would increase the overall influence of artificial intelligence by 15% to 40%.

Considering these economic implications, SAS and Coleman Parkes surveyed 1,600 decision makers across key global markets. Respondents work in a range of industries, including banking, insurance, the public sector, life sciences, health care, telecommunications, manufacturing, retail, energy and utilities, and professional services. The smallest organizations surveyed employed 500 to 999 people, and the largest employed more than 10,000.

Learn more in the full research report and an interactive data dashboard.

“While China may lead in GenAI adoption rates, higher adoption doesn't necessarily equate to effective implementation or better returns,” said Stephen Saw, Managing Director at Coleman Parkes. “In fact, the US nudges ahead in the race with 24% of organizations having fully implemented GenAI compared to 19% in China.”

Global regions charge ahead with GenAI

Highlights from the global survey indicate that all regions are adopting GenAI in meaningful ways, but at different rates.

“With any new technology, organizations must navigate a discovery phase, separating hype from reality, to understand the complexity of real-world implementations in the enterprise. We have reached this moment with generative AI,” said Bryan Harris, Executive Vice President and CTO at SAS. “As we exit the hype cycle, it is now about purposefully implementing and delivering repeatable and trusted business results from GenAI.”

Where do regions rank in fully implementing generative AI into their organizations' processes?

  • North America: 20%
  • Northern Europe: 7%
  • South West and Eastern Europe: 7%

Which regions have implemented GenAI use policies?

  • North America: 63%
  • South West and Eastern Europe: 60%
  • Northern Europe: 58%

Of those planning to invest in GenAI in the next financial year, how many have a dedicated budget?

  • Northern Europe: 91%
  • South West and Eastern Europe: 91%
  • North America: 89%

Note: North America comprises the United States and Canada; LATAM includes Brazil and Mexico; Northern Europe includes the United Kingdom/Ireland, Sweden, Norway, Finland and Denmark; South West and Eastern Europe is France, Germany, Italy, Benelux, Spain and Poland; and APAC encompasses Australia, China, Japan and the United Arab Emirates/Saudi Arabia.

Industries and functional divisions embrace GenAI at varying rates

Sabine VanderLinden, CEO and Venture Partner, Alchemy Crew, sees much potential for industries investing in GenAI. “The future of business is being reshaped by generative AI,” she said. “Indeed, the integration of GenAI into business processes – from dynamic profiling in marketing to precision claims insurance – offers unparalleled opportunities for efficiency, personalization, and strategic foresight. Embracing this technology is essential for staying ahead in a highly uncertain and unpredictable competitive market.”

When split into industry segments, the data shows banking and insurance leading other industries in terms of incorporating GenAI into daily business operations across a variety of metrics. Highlights from those findings are below. 

How do specific industries rank in fully implementing GenAI into regular business processes?

  • Banking: 17%
  • Insurance: 11%
  • Life sciences: 11%
  • Professional services: 11%
  • Retail: 10%
  • Public sector: 9%
  • Health care: 9%
  • Manufacturing: 7%
  • Energy and utilities: 6%

Which industries indicate they already use GenAI daily to some extent?

  • Retail: 27%
  • Banking: 23%
  • Professional services: 23%
  • Insurance: 22%
  • Life sciences: 19%
  • Health care: 17%
  • Energy and utilities: 17%
  • Manufacturing: 16%
  • Public sector: 13%

Which departments inside organizations are using or planning to use GenAI?  

  • Marketing: 85%
  • Finance: 75%
  • Production: 75%

Early adopters are finding plenty of obstacles in using and implementing GenAI

No. 1 on the list of challenges organizations face in putting GenAI to routine use is the lack of a clear GenAI strategy.

Only 9% of leaders responding to the survey indicate they are extremely familiar with their organization's adoption of GenAI. Even among respondents whose organizations have fully implemented GenAI, only 25% say they are extremely familiar with their organization's GenAI adoption strategy. In other words, even the decision makers responsible for technology investments aren't deeply familiar with AI – including those at organizations ahead of the adoption curve.

Nine out of 10 senior technology decision makers overall admit they don't fully understand GenAI and its potential to affect business processes. CIOs lead the way, with 45% saying they understand their organization's AI adoption strategy; only 36% of chief technology officers (CTOs) say they're fully in the know.

Yet despite this understanding gap, most organizations (75%) say they have set aside budgets to invest in GenAI in the next financial year.

Other challenges organizations face include:

  • Data: As organizations adopt GenAI, they realize they have insufficient data to fine-tune large language models (LLMs). They also realize, once they are deep into deployment, that they lack the appropriate tools to implement AI successfully. Organizations' IT leaders are most concerned about data privacy (76%) and data security (75%).
  • Regulation: Only a tenth of organizations say they are fully prepared to comply with coming AI regulations, and only a third of those that have fully implemented GenAI believe they can comply. Just 7% provide a high level of training on GenAI governance, and only 5% have a reliable system in place to measure bias and privacy risks in LLMs (one simple form such a measurement could take is sketched after this list).
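What might a bias-measurement system for LLMs look like? One common idea is a counterfactual probe: feed the model prompt pairs that differ only in a demographic attribute and compare the outputs. The sketch below is a minimal, hypothetical illustration, not any vendor's tooling; the `generate` stub and the length-based score are stand-ins for a real model call and a real metric.

```python
# Minimal counterfactual bias probe (illustrative only).
def generate(prompt: str) -> str:
    # Stand-in for a deployed LLM call; replace with a real model client.
    return prompt

# Prompt pairs that differ only in a demographic attribute.
PAIRS = [
    ("Write a reference letter for John, a software engineer.",
     "Write a reference letter for Jane, a software engineer."),
    ("Describe a typical day for a male nurse.",
     "Describe a typical day for a female nurse."),
]

def counterfactual_gap(pairs, score=len):
    """Mean absolute difference of a scalar score across paired outputs.
    A gap near zero is weak evidence of parity on this probe; a large
    gap flags the pair for human review."""
    gaps = [abs(score(generate(a)) - score(generate(b))) for a, b in pairs]
    return sum(gaps) / len(gaps)

print(f"average counterfactual gap: {counterfactual_gap(PAIRS):.2f}")
```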

Despite the obstacles, some early adopters already report meaningful benefits: 89% cite improved employee experience and satisfaction, 82% say they are saving on operational costs, and 82% report higher customer retention.

Keep up with the latest news from SAS by following @SASsoftwareNews on X/Twitter.

SAS is a global leader in data and AI. With SAS software and industry-specific solutions, organizations transform data into trusted decisions. SAS gives you THE POWER TO KNOW®.

Editorial contacts:

  • SAS Corporate HQ: Laura Brumley, (214) 803-6692


Hackers beware: Research shows AI can assist with cybersecurity

A Mizzou researcher and collaborators found that leading chatbots can pass certified ethical hacking exams.

July 9, 2024 | Contact: Janese Heavin, [email protected]

Chatbots powered by artificial intelligence (AI) can pass a cybersecurity exam, but don’t rely on them for complete protection.

That’s the conclusion of a recent paper co-authored by University of Missouri researcher Prasad Calyam and collaborators from Amrita University in India. The team tested two leading generative AI tools — OpenAI’s ChatGPT and Google’s Bard — using a standard certified ethical hacking exam.

Certified Ethical Hackers are cybersecurity professionals who use the same tricks and tools as malicious hackers to find and fix security flaws. Ethical hacking exams measure a person’s knowledge of different types of attacks, how to protect systems and how to respond to security breaches.

Portrait: Prasad Calyam

ChatGPT and Bard (now Gemini) are advanced AI programs called large language models. They generate human-like text using neural networks with billions of parameters, which allow them to answer questions and create content.

In the study, Calyam and his team tested the bots with standard questions from a validated certified ethical hacking exam. For example, they challenged the AI tools to explain a man-in-the-middle attack, in which a third party intercepts communication between two systems. Both tools were able to explain the attack and suggest security measures to prevent it.
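To make the concept concrete, here is a minimal sketch of the interception idea (not code from the study): an attacker-controlled relay sits between a client and a server and can read everything passing through. The addresses are hypothetical, and the closing comment notes the standard countermeasure.

```python
# Conceptual man-in-the-middle relay (educational sketch only).
import socket
import threading

REAL_SERVER = ("example.com", 80)   # hypothetical legitimate upstream server
LISTEN_ADDR = ("127.0.0.1", 8080)   # where the victim is tricked into connecting

def relay(src: socket.socket, dst: socket.socket, label: str) -> None:
    """Copy bytes one way, logging what the attacker can observe in transit."""
    while chunk := src.recv(4096):
        print(f"[intercepted {label}] {chunk[:60]!r}")
        dst.sendall(chunk)

listener = socket.socket()
listener.bind(LISTEN_ADDR)
listener.listen(1)
victim, _ = listener.accept()                     # victim connects to the attacker
upstream = socket.create_connection(REAL_SERVER)  # attacker connects onward

threading.Thread(target=relay, args=(victim, upstream, "->"), daemon=True).start()
relay(upstream, victim, "<-")

# Countermeasure: TLS with certificate verification defeats this relay, since
# the attacker cannot present a valid certificate for the real server's name.
```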

Overall, Bard slightly outperformed ChatGPT on accuracy, while ChatGPT gave better responses in terms of comprehensiveness, clarity and conciseness, the researchers found.

“We put them through several scenarios from the exam to see how far they would go in terms of answering questions,” said Calyam, the Greg L. Gilliom Professor of Cyber Security in Electrical Engineering and Computer Science at Mizzou. “Both passed the test and had good responses that were understandable to individuals with background in cyber defense — but they are giving incorrect answers, too. And in cybersecurity, there’s no room for error. If you don’t plug all of the holes and rely on potentially harmful advice, you’re going to be attacked again. And it’s dangerous if companies think they fixed a problem but haven’t.”

Researchers also found that when the platforms were asked to confirm their responses with prompts such as “are you sure?” both systems changed their answers, often correcting previous errors. When the programs were asked for advice on how to attack a computer system, ChatGPT referenced “ethics” while Bard responded that it was not programmed to assist with that type of question.
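The paper's exact protocol isn't reproduced here, but this style of probe is straightforward to sketch. The snippet below is a minimal illustration using OpenAI's Python SDK (the model name and question are illustrative, and the Bard/Gemini side would use Google's API instead): it asks an exam-style question, then challenges the model with a confirmation prompt.

```python
# Ask an exam-style question, then press the model with "Are you sure?"
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def ask(messages):
    resp = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
    return resp.choices[0].message.content

question = ("Explain a man-in-the-middle attack and name one "
            "security measure that prevents it.")
history = [{"role": "user", "content": question}]
first = ask(history)

history += [
    {"role": "assistant", "content": first},
    {"role": "user", "content": "Are you sure? Please double-check your answer."},
]
second = ask(history)

print("Answer changed after the challenge:", first.strip() != second.strip())
```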

Calyam doesn’t believe these tools can replace human cybersecurity experts with problem-solving expertise to devise robust cyber defense measures, but they can provide baseline information for individuals or small companies needing quick assistance.

“These AI tools can be a good starting point to investigate issues before consulting an expert,” he said. “They can also be good training tools for those working with information technology or who want to learn the basics on identifying and explaining emerging threats.” 

The most promising part? The AI tools are only going to continue to improve their capabilities, he said.

“The research shows that AI models have the potential to contribute to ethical hacking, but more work is needed to fully harness their capabilities,” Calyam said. “Ultimately, if we can guarantee their accuracy as ethical hackers, we can improve overall cybersecurity measures and rely on them to help us make our digital world safer and more secure.”

The study, “ChatGPT or Bard: Who is a better Certified Ethical Hacker,” was published in the May issue of the journal Computers & Security. Co-authors were Raghu Raman and Krishnashree Achuthan.

This story was originally published by Show Me Mizzou. Learn more about cybersecurity and other research areas in electrical engineering and computer science at Mizzou Engineering!


