(Replying to PARENT post)
This is already a thing.
In fact the rumour is that the US govt discovered the identity of Bitcoin's Satoshi using this.
https://medium.com/cryptomuse/how-the-nsa-caught-satoshi-nak...
👤_nedR🕑6y🔼0🗨️0
(Replying to PARENT post)
I keep forgetting the general public is not aware of these things.
Data is like nuclear waste. Everything you do online leaves a pattern of behavior that is unique to you. Your only saving grace is no one cares about you specifically, until they do.
👤ggggtez🕑6y🔼0🗨️0
(Replying to PARENT post)
A simple Markov chain has already been used to detect that a different author had written one chapter of a book. That was in 2003, I think; unfortunately I cannot find a reference for it. My point is just that Markov chains are a very basic, old ML method that is quite effective for this kind of task.
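To make the idea concrete, here is a minimal sketch of how a Markov chain could be used for this. It is not the 2003 method (that reference is still missing); it just assumes a first-order, character-level chain with add-one smoothing, and the function names and the `alphabet_size` parameter are made up for illustration.

```python
import math
from collections import defaultdict


def train_markov(text):
    """Count character-to-character transitions in a training text."""
    counts = defaultdict(lambda: defaultdict(int))
    for prev, nxt in zip(text, text[1:]):
        counts[prev][nxt] += 1
    return counts


def avg_log_likelihood(counts, text, alphabet_size=256):
    """Average log-probability per transition, using add-one smoothing.

    alphabet_size is an assumed upper bound on distinct characters.
    """
    total = 0.0
    n = 0
    for prev, nxt in zip(text, text[1:]):
        row = counts.get(prev, {})
        row_total = sum(row.values())
        p = (row.get(nxt, 0) + 1) / (row_total + alphabet_size)
        total += math.log(p)
        n += 1
    return total / n if n else float("-inf")


def most_likely_author(models, text):
    """Return the name of the model under which the text is most probable."""
    return max(models, key=lambda name: avg_log_likelihood(models[name], text))
```

You would train one model per known author, then score each chapter against every model; a chapter that scores noticeably better under someone else's model is the suspicious one. Real texts need far more training data than a toy example.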
👤yogsototh🕑6y🔼0🗨️0
(Replying to PARENT post)
This sort of analysis is older than Tolkien. Doing it at scale has pretty substantial processing requirements, and it's pretty inaccurate. In the future, people who say controversial things will stick to short, sentence-long statements to render this sort of analysis useless.
👤TheOperator🕑6y🔼0🗨️0
(Replying to PARENT post)
There are rephrasing services available, presumably for helping users plagiarise. Some are laughably bad but possibly helpful, while others are quite good.
E.g.:
https://quillbot.com/app
👤lostlogin🕑6y🔼0🗨️0
(Replying to PARENT post)
Isn't this the reason grep was created? It was used to determine which parts of the Federalist Papers were written by which author.[0]
Considering this occurred in 1974, I can only imagine that techniques for de-anonymizing authors have gotten much better, given how much written text individuals post on social media sites like HN. Uh oh.
👤mmcwilliams🕑6y🔼0🗨️0
(Replying to PARENT post)
It's already a thing; see how the FBI caught the Silk Road admin. Although that wasn't done in the automated fashion you suggest, I am pretty sure the algos are already in use.
👤cocochanel🕑6y🔼0🗨️0
(Replying to PARENT post)
👤Xelbair🕑6y🔼0🗨️0
(Replying to PARENT post)
Aren't college students everywhere already exposed to this with every text, due to "plagiarism detection" software?
👤kzrdude🕑6y🔼0🗨️0
(Replying to PARENT post)
Possible solution: run your text through a different machine translator for each account. Make minor corrections for cohesiveness.
👤salutonmundo🕑6y🔼0🗨️0
(Replying to PARENT post)
But the whole cat and mouse game hasn't really started yet. Once people find out what the algo looks at, they can try to game it. E.g. if you know it looks for recurring phrases like "first of all", you can stop using them. Or if it looks at specific errors, you can sprinkle them into one text but not another.
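As a rough sketch of what such an algo might look at (purely an assumption, not any real system): count how often a fixed list of marker phrases appears per word, then compare two texts by cosine similarity. The `MARKERS` list and the function names are hypothetical.

```python
import math
import re

# Hypothetical marker list; a real system would likely use hundreds of features.
MARKERS = ["first of all", "however", "in fact", "basically",
           "the", "of", "and", "which"]


def profile(text):
    """Relative frequency of each marker phrase, per word of text."""
    lowered = text.lower()
    n_words = max(len(re.findall(r"[a-z']+", lowered)), 1)
    return [
        len(re.findall(r"\b" + re.escape(marker) + r"\b", lowered)) / n_words
        for marker in MARKERS
    ]


def cosine(a, b):
    """Cosine similarity of two profiles; 0.0 if either one is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0
```

Gaming it then means changing your habits on whichever markers the other side counts, which only works if you know (or guess) the list.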
👤lordnacho🕑6y🔼0🗨️0
(Replying to PARENT post)
Wasn't that how they caught the Unabomber? I saw a documentary about the guy who caught him using this sort of analysis, although his method was quite analog (reading through handwritten letters and the Unabomber's correspondence with the press).
👤metastew🕑6y🔼0🗨️0
(Replying to PARENT post)
It's only a matter of time before an AI-powered 'prose anonymiser' is developed.
Then you can just run all your naughty words through the Russian styliser.
👤Simple_Guy🕑6y🔼0🗨️0
(Replying to PARENT post)
Simple and obscure solution: Google Translate to another language and back to the original.
👤viko-h🕑6y🔼0🗨️0
(Replying to PARENT post)