Not every video on the internet is real,
and the fake ones are multiplying.
That's thanks to the spread of Deepfakes.
Deepfakes are videos that have been altered
using machine learning, a form of artificial intelligence,
to show someone saying or doing something
that they did not in fact do or say.
The results can be great fun.
Take, for example, these hilarious clips
of Nicolas Cage starring in movies he was never in,
but Deepfakes can also be a tool for harassment,
and a way to spread political misinformation.
To learn more about the Deepfakes era we live in,
I spoke with Sam Gregory, who tracks these videos
at the human rights non-profit Witness.
What is a Deepfake and where do they come from?
Why are we talking about them all of a sudden?
Deepfakes are the next generation
of video and audio manipulation, and sometimes images.
They're based on artificial intelligence,
and they make it much easier to do a range of things.
So, what people think of as a Deepfake is typically
the face swap, right?
You take the face of one person and you transfer it
onto another person.
But we might also think within the same category
of other forms of synthetic media manipulation,
like the ability to manipulate someone's lips,
and perhaps sync them up with a fake or real audio track,
or the ability to make somebody's body move,
or appear to move, in a way that is realistic
but is in fact computer generated.
And all of this is driven
by advances in artificial intelligence,
particularly the use of what are known as
generative adversarial networks.
These adversarial networks set two artificial intelligence models
competing against each other: one producing forgeries,
the other competing to detect the forgeries.
As the detector gets better at spotting fakes,
the forger improves in response,
so the forgeries keep getting better
through this competition between the two networks.
And that's one of the big challenges underlying Deepfakes:
they are often improving
because of the very nature of that competition.
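To make that adversarial idea concrete, here is a toy numerical sketch. Everything in it is hypothetical for illustration: real GANs use neural networks, not single numbers, but the dynamic is the same — a "forger" and a "detector" each adjust themselves in response to the other, and the forger's output drifts toward the real data purely through the competition.

```python
import random

# Toy sketch of adversarial training (illustrative only, not a real GAN):
# a "forger" produces numbers, a "detector" holds a decision boundary,
# and each round both adjust based on the other's behavior.

random.seed(0)

REAL_MEAN = 5.0  # the "real data" the forger is trying to imitate
LR = 0.05        # learning rate for both players

forger_mean = 0.0        # forger starts out producing obvious fakes
detector_boundary = 2.0  # detector calls samples above this line "real"

for step in range(2000):
    fake = random.gauss(forger_mean, 0.5)
    real = random.gauss(REAL_MEAN, 0.5)
    # Detector: nudge the boundary toward the midpoint of what it just saw,
    # so it keeps separating current fakes from current real samples.
    detector_boundary += LR * ((fake + real) / 2 - detector_boundary)
    # Forger: when its fake falls below the boundary (i.e. gets caught),
    # shift its output toward what the detector currently accepts as real.
    if fake < detector_boundary:
        forger_mean += LR * (detector_boundary - forger_mean)

# The forger never observes REAL_MEAN directly; only the detector's
# feedback drives it there.
print(round(forger_mean, 2), round(detector_boundary, 2))
```

In this sketch, as in a real GAN, the forgeries improve precisely because the detector improves: each side's progress is the other side's training signal.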
There are so many different ways you could use
that technology.
What are we seeing out there in the wild?
For the moment,
they're primarily non-consensual sexual images.
Probably up to 95% of the Deepfakes out there
are images of celebrities,
or non-consensual images of ordinary people,
being shared on porn sites
or in closed messaging.
We have started to see some other cases
of Deepfakes being used in other contexts,
targeting women journalists or civic activists
with images that appear to show them in
sexual situations.
We've also started to hear people using the
"it's a Deepfake" excuse.
So in the small number of political cases
where there was potentially a Deepfake,
you see people weaponizing the phrase "it's a Deepfake,"
and in that case it's really a version
of the same phrase: "it's fake news."
And Sam, how easy has this technology become
to access?
You mentioned that it's improved.
Can anyone do this?
It's still not at the point that anyone can do
a really convincing face swapping fake.
There's code available online,
there are websites you can go to that will allow you
to create a Deepfake.
You know, some of those Deepfakes will be imperfect,
but we also know that imperfect Deepfakes
can still cause harm.
So it's getting more accessible
because it's getting commercialized, monetized,
and what's become clear in the last six months is that
Deepfakes, and other synthetic media
like audio generation, are getting better and better,
and requiring less training data, fewer examples,
to generate the output,
all of which means we're gonna get more and more
of this content, and it's probably gonna be
of better and better quality.
In Congress there has been concern
about Deepfakes being used to distort political campaigns,
maybe even the 2020 Presidential Campaign.
Right, there are clearly vulnerabilities
for political candidates,
like the last-minute surprise of a compromising video.
A lot of attention goes to political candidates,
and there are detection methods being developed
specifically to protect them from Deepfakes.
And the reason people worry about the advances in Deepfakes
and in other synthetic media,
is we've really seen quite significant progress
in the last six to 12 months,
we've seen a decline in the amount of training data needed,
down to a few images
for some of the facial expression modification.
We've seen people start to combine video manipulation,
like the lips, with simulation of audio.
And we're starting to see the commercialization of this
into apps.
And as these tools move to mobile and become apps,
they obviously get much more available.
And this is what puts pressure on us to ask:
how are we making sure that as these get more available
they're detectable,
and that app makers also think about detection
at the same time as they think about creation?
Because we have a Pandora's box there,
and we've already seen how a Pandora's box like that
can be unleashed.
What possible solutions are people talking about?
You mentioned the idea of a technical solution,
I guess the ideal thing
would be something like a spam filter,
spam filtering is pretty good these days,
you don't see much spam,
could we do that for Deepfakes, just block them out?
We could, but we'd have to define what we think is
a malicious Deepfake, right?
Because Deepfakes and this whole genre of synthetic media
are really related to computational photography,
like doing a funny face filter in an app.
Now you might say, that's fun, that's my granny;
or you might say, that's great,
I think it's great that that's a satire of my president;
or you might look and say, I wanna check this
against another source.
What we're not doing actually at the moment
is telling people how to detect Deepfakes
with technical clues.
And the reason for that is that
each of those glitches is the current algorithm's
Achilles' heel, right?
It's a problem with the current version of the algorithm,
but as we put different data into the algorithm,
and as we recognize that's a problem,
the next version isn't gonna do that.
So for example, a year ago people thought that Deepfakes
didn't really blink, and now you see Deepfakes that blink.
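For illustration, that blink clue could be turned into a crude signal along these lines. This is a hypothetical sketch: the eye-openness scores, thresholds, and blink rates are all made up for the example, real forensic tools are far more sophisticated, and as Sam notes, this particular clue has already stopped working.

```python
# Toy sketch of the (now outdated) blink-rate heuristic: count blinks in a
# per-frame "eye openness" signal and flag clips that blink implausibly
# rarely. All numbers here are illustrative, not from a real tool.

def count_blinks(openness, threshold=0.2):
    """Count downward crossings of the openness threshold (one per blink)."""
    blinks = 0
    was_open = True
    for value in openness:
        if was_open and value < threshold:
            blinks += 1
            was_open = False
        elif value >= threshold:
            was_open = True
    return blinks

def looks_suspicious(openness, fps=30, min_blinks_per_minute=4):
    """Flag clips blinking far less often than a typical person (~15-20/min)."""
    minutes = len(openness) / fps / 60
    return count_blinks(openness) / minutes < min_blinks_per_minute

# A real clip: eyes mostly open, with a periodic blink (20 blinks in 40 s).
real_clip = ([1.0] * 55 + [0.05] * 5) * 20
# An early Deepfake: eyes open the whole time (0 blinks in 40 s).
fake_clip = [1.0] * 1200

print(looks_suspicious(real_clip), looks_suspicious(fake_clip))
```

The point of the sketch is exactly the fragility Sam describes: the moment generators were trained on data that included blinking, this kind of single-glitch check stopped separating real from fake.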
Now there are technical solutions.
They are all gonna be partial solutions,
and we should want them to be partial solutions.
There's a lot of investment in detection,
using advanced forms of media forensics.
The problem with all those approaches is that
they are always at a disadvantage:
the attacker has the advantage with each new technique,
and can learn from the previous generations
of forgery and forgery detection.
Substituting a kind of technical check mark
for human reasoning is not a great idea.
Systems like that get broken,
they're an absolute honeypot for hackers
and people who wanna disrupt them,
and also these things are complex, right?
Something may look real, and it may not matter to us
that it has had some manipulation,
so you don't wanna give that a cross;
and something may have a tick mark, but in fact
the context is all wrong.
I tend to think of detection as being the thing
that at least gives us some signals,
some signals that might help us say
actually there's something suspicious here,
I'm gonna need to use my media literacy,
I'm gonna have to think about it.
Well, that's interesting.
You mention the question of
how people should think differently
now that we're in the Deepfake era, you might call it.
I guess it was never a good idea to believe everything
you saw on the internet,
and now you can't believe anything you see?
What's the right mindset to have?
I think it's also a problem generally
with the misinformation and disinformation discussion:
we've convinced people they can't believe anything online,
when the reality is much of what's shared online
is true, or true enough.
It does increase the pressure on us to recognize
that photos and text are not necessarily trustworthy,
we need to use our media literacy on them
to assess where it came from, is there corroboration,
and what's complicated about video and audio
is we have a different cognitive reaction,
we don't have the filters
we've either built or cognitively have
around text and photo,
so I think there's a real onus on both platforms,
who have the capacity to be looking for this,
and the people who've built the tools
that are starting to create this,
to feel a responsibility to, yes,
build tools for creation,
but also to build tools for detection,
and then we can plug that into a culture
where we're really saying: you do need media literacy,
you do need to look at content and assess it,
and I don't think that's the same
as saying it's the end of truth.
I think it's saying we have to be skeptical viewers,
and asking: how do we give those viewers technical signals,
and how do we build the media literacies
that will deal with this latest generation of manipulation?
Well Sam, thank you very much for your help
understanding Deepfakes.
Thank you Tom, I appreciate the interview.