- title: Possible routes for distributed anti-abuse systems
- date: 2017-04-04 18:00
- author: Christine Lemmer-Webber
- tags: federation, anti-abuse
- slug: possible-distributed-anti-abuse
- ---
- <p>
- I work on federated standards and systems, particularly
- <a href="https://www.w3.org/TR/activitypub/">ActivityPub</a>.
- Of course, if you work on this stuff, every now and then the question
- of "how do you deal with abuse?" very rightly comes up.
- Most recently <a href="https://mastodon.social/">Mastodon</a> has gotten
- some attention, which is great!
- But of course, people are raising the question,
- <a href="https://brian.mastenbrook.net/blog/2017/federating-failure.html">can federation systems really protect people from abuse</a>?
- (It's not the first time it has come up either; at LibrePlanet in 2015 a
- number of us held a "social justice for federated free software systems"
- dinner and were discussing things then.)
- It's an important question to ask, and I'm afraid the answer is,
- "not reliably yet".
- But in this blogpost I hope to show that there may be some hope for
- the future.
- </p>
- <p>A few things I think you want out of such a system:</p>
- <ul>
- <li>
- It should actually be decentralized.
- It's possible to run a mega-node that everyone screens their content
- against, but then what's the point?
- </li>
- <li>
- The most important thing is for the system to prevent attackers from
- being able to deliver hateful content.
- An attack in a social system means getting your message across, so
- that's what we don't want to happen.
- </li>
- <li>
- But who are we protecting, and against what?
- It's difficult to know, because even very progressive groups often don't
- anticipate who they need to protect; "social justice" groups of the past
- are often exclusionary against other groups until they find out they need
- to be otherwise (e.g., in each of these important social movements, some
- prominent members have had problems including other social justice groups:
- racist suffragists, civil rights activists exclusionary against gay and
- lesbian groups, gay and lesbian groups exclusionary against transgender
- individuals...).
- The point is: if we haven't gotten it all right in the past, we might not
- get it all right in the present, so the most important thing is to allow
- communities to protect themselves from hate.
- </li>
- </ul>
- <p>
- Of course, keep in mind that <em>no</em> technology system is going
- to be perfect; these are all imperfect tools for mitigation.
- But the technical decisions you make also affect who is
- empowered in a system, so it's still important to work on
- these, even though none of them is a panacea.
- </p>
- <p>
- With those core bits down, what strategies are available?
- There are a few I've been paying close attention to
- (keep in mind that I am an expert in zero of these routes at present):
- </p>
- <ul>
- <li>
- <strong>Federated Blocklists:</strong> The easiest "starter" route.
- And good news!
- If you're using the
- <a href="https://www.w3.org/TR/activitypub/">ActivityPub standard</a>,
- there's
- <a href="https://www.w3.org/TR/activitypub/">already a Block activity</a>,
- and you could build up group-moderated collections of people to block.
- A decent first step, but I don't think it gets you very far; for one thing,
- being the maintainer of a public blocklist is a risky activity;
- trolls might use that information to attack you.
- On top of that, merging/squashing blocklists might be awkward in this system.
- </li>
- <li>
- <strong>Federated reputation systems:</strong>
- You could also take it a step further by using something like the
- <a href="https://www.stellar.org/how-it-works/stellar-basics/explainers/">Stellar consensus protocol</a>
- (more info in <a href="https://www.stellar.org/papers/stellar-consensus-protocol.pdf">paper form</a>
- or even
- <a href="https://www.stellar.org/stories/adventures-in-galactic-consensus-chapter-1/">a graphic novel</a>).
- Stellar is a cryptographically signed ledger. Okay, yes, that makes it a
- kind of blockchain (which will make some people's eyes glaze over, but
- technically a signed git repository is also a blockchain), but it's not
- necessarily restricted to use of cryptocurrencies... you can track any
- kinds of transactions with it.
- Which means we could also track blocklists, or even less binary
- reputation systems! But what's most interesting about Stellar is that
- it's also federated... and in this case, federation means you can
- <em>choose</em> what groups you trust... but due to math'y concepts that
- I occasionally totally get upon being explained to me and then forget the
- moment someone asks me to explain to someone else, consensus is still
- enforced within the "slices" of groups you are following.
- You can imagine maybe the needs of an LGBT community and a Furry
- community might overlap, but they might not be the same, and maybe you'd
- be subscribed to just one or both, or neither.
- Or pick your other social groups, go wild.
- That said, I'm not sure how to make these "transactions" not public in
- this system, so it's very out there in the open, but since there's a
- voting system built-in maybe particular individuals won't be as liable
- for being attacked as individuals maintaining a blocklist are.
- Introducing a sliding-scale "social reputation system" may also introduce
- other dangerous problems, though I think Stellar's design is probably the
- least dangerous of all of these since it probably will still keep abusers
- out of a particular targeted group, while still giving groups that are
- marginalized but not recognized by larger groups avenues to set up
- their own slices as well.
- </li>
- <li>
- <strong>"Charging" for distributing messages:</strong>
- Hoo boy, this one's going to be controversial!
- This was suggested to me by someone smart in the whole distributed
- technology space.
- It's not necessarily what we would normally consider real money that
- would be charged to distribute things... it could be a kind of "whuffie"
- cryptocurrency that you have to pay.
- Well the upside to this is it would keep low-funded abusers out of a
- system... the downside is that you've now basically powered your
- decentralized social network through pay-to-play capitalism.
- Unfortunately, even if the cryptocurrency is just some "social media fun
- money", imaginary currencies have a way of turning into real currencies;
- see paying for in-game currency in any massively multiplayer game ever.
- I don't think this gives us the power dynamics we want in our system, but
- it's worth noting that "it's one way to do it"... with serious side
- effects.
- </li>
- <li>
- <strong>Web of trust / Friend of a Friend networks:</strong>
- Well researched in crypto systems, though nobody's built really good
- UIs for them.
- Still, a lot of potential if the system was somehow made friendly and
- didn't require showing up to a nerd-heavy "key-signing party"...
- if the system could have marking who you trust and who you don't (and not
- just in terms of verifying keys) built as an elegant part of the UI,
- then yes I think this could be a good component for recognizing who you
- might allow to send you messages.
- There are also risks in having these associations be completely public,
- though I think web of trust systems don't necessarily have to be
- public... you can recurse outward from the individuals you do already
- know.
- (<i>Edit:</i> My friend <a href="http://www.draketo.de/">ArneBab</a>
- suggests that looking at how Freenet handles its web of trust
- would be a good starting point for someone wishing to research
- this.
- I have 0 experience with Freenet, but
- <a href="https://github.com/freenet/wiki/wiki/Web-Of-Trust">here</a> are
- <a href="https://emu.freenetproject.org/pipermail/devl/2016-April/038916.html">some</a>
- <a href="http://freesocial.draketo.de/fms_en.html">resources</a>.)
- </li>
- <li>
- <strong>Distributed recommendation systems:</strong>
- Think of recommender systems in
- (sorry for the centralized system references)
- Amazon, Netflix, or any of the major social networks
- (Twitter, Facebook, etc).
- Is there a way to tell if someone or some message may be relevant to you,
- depending on who else you follow? Almost nobody seems to be doing research
- here, but not quite nobody; here's one paper:
- <a href="https://people.eecs.berkeley.edu/%7Ejfc/'mender/IEEESP02.pdf">Collaborative Filtering with Privacy</a>.
- Would it work?
- I have no idea, but the paper's title sure sounds compelling.
- (<i>Edit:</i>
- ArneBab also points out that
- <a href="http://credence-p2p.org/">credence-p2p</a> might also be useful
- to look at.
- <a href="http://credence-p2p.org/paper.html">Relevant papers here.</a>)
- </li>
- <li>
- <strong>Good ol' Bayesian filtering:</strong>
- Unfortunately, I think that there are too many alternate routes of attack
- for just processing a message's statistical contents to be good enough,
- though I think it's probably a good component of an anti-abuse system.
- In fact, maybe we should be talking about solutions that can use multiple
- components, and be very adaptive...
- </li>
- <li>
- <strong>Distributed machine learning sets:</strong>
- Probably way too computationally expensive to run in a decentralized
- network, but maybe I'm wrong.
- Maybe this can be done in the right way, but I get the impression
- that without the training dataset it's probably not useful?
- Prove me wrong!
- But I also just don't know enough about machine learning.
- Has the right property of being adaptive, though.
- </li>
- <li>
- <strong>Genetic programs:</strong>
- Okay, I hear you saying,
- "what?? genetic programming?? as in programs that evolve?"
- It's a field of study that has quite a bit of research behind it,
- but very little application in the real world... but it might be a good
- basis for filtering systems in a federated network
- (I'm beginning to explore this but I have no idea if it will bear fruit).
- Programs might evolve on your machine and mine which adapt to the changing
- nature of social attacks.
- And best of all, in a distributed network, we might be able to send our
- genetic anti-abuse programs to each other... and they could breed and make
- new anti-abuse baby programs!
- However, for this to work the programs would have to carry part of the
- information of their "experiences" from parent to child.
- After all, a program isn't very likely to randomly discover
- that a hateful group has started using "cuck" as a slur.
- But programs keep information around while they run, and it's possible that
- parent programs could teach wordlists and other information to their
- children, or to other programs.
- And if you already have a trust network, your programs could share their
- techniques and information with each other.
- (There's a risk of a side channel attack though: you might be able to find
- some of the content of information sent/received by checking the wordlists
- and other data being passed around by these programs.)
- (You'd definitely want your programs sandboxed if you took this route,
- and I think it would be good for filtering only... if you expose output
- methods, your programs might start talking on the network, and who knows
- what would happen!)
- One big upside to this is that if it worked, it <em>should</em> work in a
- distributed system... you're effectively bringing the
- anti-abuse hamster cages together now and then.
- However, you do get into an ontology problem... if these programs are
- making up wordlists and binding them to generated symbols, you're
- effectively generating a new language.
- That's not too far from human-generated language, and so at that point
- you're talking about a computer-generated natural language... but I think
- there may be evolutionary incentive to agree upon terms.
- Setting up the "fitness" of the program (same with the machine learning
- route) would also have to involve determining what filtering is useful /
- isn't useful to the user of the program, and that's a whole challenging
- problem domain of its own (though you could start with just manually
- marking correct/incorrect the way people train their spam filters with
- spam/ham).
- But... okay by now this sounds pretty far-fetched, I know, but I think it
- has some promise... I'm beginning to explore it with a derivative of some
- of the ideas from
- <a href="http://faculty.hampshire.edu/lspector/push.html">PushGP</a>.
- I'm not sure if any of these ideas will work, but I think this one is both
- the most entertainingly exciting and the craziest at the same time.
- (On another side, I also think there's an untapped potential for
- roguelike AI that's driven by genetic algorithms...)
- There's definitely one huge downside to this though, even if it
- <em>was</em> effective (the same problem machine learning groups have)...
- the programs would be nearly unreadable to humans!
- Would this really be the only source of information you'd want to trust?
- </li>
- <li>
- <strong>Expert / constraint based systems:</strong>
- Everyone's super into "machine learning" based systems right now, but
- it's hard to tell what on earth those systems are <em>doing</em>, even
- when their results are impressive (not far off from genetic algorithms,
- as above! but genetic algorithms may not require the same crazy large
- centralized datasets that machine learning systems tend to).
- Luckily there's a whole other branch of AI involving "expert systems",
- "symbolic reasoning", and so on.
- The most promising of these I think is the
- <a href="https://groups.csail.mit.edu/mac/users/gjs/propagators/">
- propagator model</a> by Sussman / Radul / and many others
- (if you've seen the constraint system in SICP, this is a grandchild of
- that design).
- One interesting thing about the propagator model is that it can come to
- conclusions from exploring many different sources, and it can <em>tell
- you how</em> it came to those conclusions.
- These systems are incredible and under-explored, though there's a catch:
- usually they're hand-wired, or the rules are added manually (which is
- partly how you can tell where the conclusions came from, since the
- symbols for those sources may be labeled by a human... but who knows,
- maybe there's a way to map a machine's concept of some term to a human's
- anyway).
- I think this probably won't be adaptive enough for the fast-changing
- world of different attack structures... but! but! we've explored a lot of
- other ideas above, and maybe you have some <em>combination</em> of a
- reputation system, a genetic programming system, and so on, and this
- branch of study could be a great route to glue those very differing
- systems together and get a sense of what may be safe / unsafe from
- different sources... and at least understand how each source, on its
- macro level, contributed to a conclusion about whether or not to trust a
- message or individual.
- </li>
- </ul>
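<p>
To make the first route a bit more concrete, here is a minimal sketch of
what merging group-moderated blocklists could look like.
It is not taken from any real implementation: the <code>make_block</code>
and <code>merge_blocklists</code> helpers, the <code>min_votes</code>
threshold, and the <code>example</code> URLs are all made up for
illustration.
The only part that comes from the standard is the shape of the activity
itself, a <code>Block</code> with an <code>actor</code> and an
<code>object</code>.
</p>

```python
from collections import Counter

# The ActivityStreams 2.0 JSON-LD context used by ActivityPub.
AS_CONTEXT = "https://www.w3.org/ns/activitystreams"


def make_block(actor, target):
    """Build a Block activity as a plain dict (hypothetical helper)."""
    return {
        "@context": AS_CONTEXT,
        "type": "Block",
        "actor": actor,
        "object": target,
    }


def merge_blocklists(blocklists, min_votes=1):
    """Merge several lists of Block activities (hypothetical helper).

    Returns the sorted targets blocked by at least `min_votes` distinct
    actors. Each (actor, target) pair counts once, so one maintainer
    repeating a block across lists does not inflate the count.
    """
    votes = Counter()
    seen = set()
    for blocklist in blocklists:
        for activity in blocklist:
            if activity.get("type") != "Block":
                continue  # ignore anything that is not a Block
            pair = (activity["actor"], activity["object"])
            if pair in seen:
                continue
            seen.add(pair)
            votes[activity["object"]] += 1
    return sorted(t for t, n in votes.items() if n >= min_votes)


# Two made-up moderation groups publishing their own lists.
list_a = [
    make_block("https://a.example/mod", "https://troll.example/u/1"),
    make_block("https://a.example/mod", "https://troll.example/u/2"),
]
list_b = [
    make_block("https://b.example/mod", "https://troll.example/u/1"),
]

# Only targets that both groups agree on.
agreed = merge_blocklists([list_a, list_b], min_votes=2)
# Everything blocked by at least one group.
union = merge_blocklists([list_a, list_b])
```

<p>
Requiring more than one vote is a crude nod toward the "less binary"
reputation idea, but note that this sketch does nothing about the risk
to maintainers: every actor is still publicly attached to every block.
</p>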
- <p>
- Okay, well that's it I think!
- Those are all the routes I've been thinking about.
- None of these routes are proven, but I hope that gives some evidence that
- there are avenues worth exploring... and that there is likely hope for
- the federated web to protect people... and maybe we could even do it
- better than the silos.
- After all, if we <em>could</em> do filtering as well as the big orgs,
- even if it were just at or nearly at the same level (which isn't as good
- as I'd like), that's already a win: it would mean we could protect
- people, and also preserve the autonomy of marginalized groups... who
- aren't very likely to be well protected by centralized regimes if push
- really does come to shove.
- </p>
- <p>
- I hope that inspires some people!
- If you have other routes that should be added to this list or you're
- exploring or would like to explore one of these directions, please
- <a href="https://dustycloud.org/contact/">contact me</a>.
- Once the <a href="https://www.w3.org/wiki/Socialwg/">W3C Social Working
- Group</a> wraps up, I'm to be co-chair of the following Social Community
- Group, and this is something we want to explore there.
- </p>
- <p>
- <i>Update:</i> I'm happy to see that
- <a href="https://news.ycombinator.com/item?id=14038700">the Matrix folks
- also see this</a> as "the single biggest existential threat" and
- "a problem that the whole decentralised web community has in common"...
- apparently they have already been looking at the Stellar approach.
- More from their
- <a href="https://matrix.org/blog/wp-content/uploads/2017/02/2017-02-04-FOSDEM-Future.pdf">
- FOSDEM talk slides</a>.
- I agree that this is a problem facing the whole decentralized web, and
- I'm glad / hopeful that there's interest in working together.
- Now's a good time to be implementing and experimenting!
- </p>