(Replying to PARENT post)
This hits pretty close to home.
(Replying to PARENT post)
...but in the meantime, here's an obligatory and shameless plug for donating to the Internet Archive[1] (tax-deductible in the US), or better yet making a recurring monthly donation so they can more accurately forecast revenue for the year, or better still getting your employer to make a nice big donation to this crucial bit of Internet memorybanks.
And as for Archive Team, we're always looking for a few good geeks.[2] Run an instance of the Warrior on spare cloud servers, or help patch and ship code at GitHub.[3]
[1] http://archive.org/donate/
(Replying to PARENT post)
I've written about this before, and even right now I'm not sure where I stand exactly, except that tweaking the algorithms to compensate for bias is definitely not the right answer: if you look at the mirror and don't like what you see, you don't draw on top of the mirror to accentuate the result! You go on a diet!
I liked the idea of data gardening, but the thought of going-to-communities is daunting. I get tired even thinking about it.
Regarding living beyond walled gardens:
> Publish your texts as text. Let the images be images. Put them behind URLs and then commit to keeping them there. A URL should be a promise.
But people already do that! The question now is why people do otherwise. I personally do not understand why people, say, post long blog posts on Facebook, but I do understand it for services like Medium.
For example, I'm extremely tempted to write on Medium because it provides the network effects of readers clicking on tags to read next. So the question is how do we democratize that?
(Replying to PARENT post)
Also, it's funny how the net changes, how unthinkable it is to have a social network that doesn't slice up people's data and use it to advertise to them now compared to how anti-advertising LiveJournal was back then. Not convinced it's a change for the better.
(Replying to PARENT post)
Also, I don't necessarily understand Ceglowski's stance on why we shouldn't use deep learning and should avoid surveillance on the web. I don't take issue with becoming a datapoint in Facebook's web of people because nothing bad has happened or can happen from me giving Facebook my data. When most people speak out about the data that's being collected about Facebook and Google users they say they're "worried about what could happen" but then never list any bad things that they're actually afraid of. The speaker falls into this trap too. Ceglowski says:
>I worry about legitimizing a culture of universal surveillance.
But he never explains what harm could come from legitimizing that culture. Maybe I'm completely missing the point of the talk? Please explain what I'm missing if I'm actually missing something.
(Replying to PARENT post)
The author's concerns over machine learning are well-founded. The best option I've been able to identify to ameliorate some of the concerns is focusing on the population that will be suppressed. Once the model returns the desired recall / precision, drawing samples from the excluded population with a rigorous acceptance standard can help validate whether you've simply built a model around your biases. Couple that with allowing an opponent to validate a randomly-selected sample and you've cleared up a lot of the uncertainty in the model.
It's not perfection, but perfection is a very difficult standard.
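To make that a bit more concrete, here is a rough sketch of the sampling step, not a definitive recipe. Everything in it is made up for illustration: the `audit_excluded` helper, the toy scores, and the `is_truly_positive` callback that stands in for the rigorous human review. The fixed seed is one possible way to let an opponent redraw and check the same sample.

```python
import random


def audit_excluded(excluded, is_truly_positive, sample_size, seed=0):
    """Sample the items a model rejected and estimate how many it wrongly rejected.

    excluded: list of items the model suppressed (hypothetical input).
    is_truly_positive: stand-in for the rigorous human review step.
    Returns (sample, estimated_miss_rate).
    """
    rng = random.Random(seed)           # fixed seed so a second party can reproduce the draw
    k = min(sample_size, len(excluded))
    sample = rng.sample(excluded, k)    # uniform sample without replacement
    if k == 0:
        return sample, 0.0
    misses = sum(1 for item in sample if is_truly_positive(item))
    return sample, misses / k


# Toy data: (score, true_label) pairs; the "model" keeps scores >= 0.5.
items = [(0.9, True), (0.7, True), (0.4, True), (0.3, False), (0.2, False), (0.1, False)]
excluded = [it for it in items if it[0] < 0.5]
sample, miss_rate = audit_excluded(excluded, lambda it: it[1], sample_size=4)
print(f"reviewed {len(sample)} excluded items, estimated miss rate {miss_rate:.2f}")
```

A high miss rate in the excluded sample would suggest the model is encoding your own cutoff rather than the thing you care about. In practice the sample would have to be large enough for the estimate to mean anything.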
(Replying to PARENT post)
And that scale is exactly the state of the internet. There is so much data available to study and understand, that we absolutely need better tools, like machine learning or whatever we want to call it, to help us keep up. Shit's moving faster than our human perception can handle, especially for those who didn't grow up with the internet.
Yes, the data analytic tools we have right now are premature, like fast food to our productized minds, but they will improve rapidly as our taste for quality improves.
But sure, demonizing the things you don't like is one step on the path to learning what's truly valuable.
(Replying to PARENT post)
Um, I'm sorry, but unsupervised learning and deep learning are not the same.
(Replying to PARENT post)
Reminds me of the phrase "graduate student descent" for training neural networks...
I've been noticing more casual dismissiveness towards grad students lately. They are certainly often treated as the grunt laborers of academia, in areas where career prospects are downright stupid. I generally feel it would be more productive to at least pretend that they're being trained to be independent, aggressive researchers in their own right, though.
(Replying to PARENT post)
So true.
(Replying to PARENT post)
What's not clear to me is why companies that don't seem to have any need for a machine learning team (e.g. a subscription box company) are looking to hire one.
Surely part of this can be pinned down to the hype associated with ML that may well die out, but the proliferation of these tools doesn't bode well for Maciej's dream of a weird, creative, and interesting internet.
(Replying to PARENT post)
Using machine learning, on the other hand, is a safe bet. It is much easier, I would assert, to write machine learning code to organize data than to curate a community of humans to organize data. The ML approach will do pretty well even if it isn't the best, which is why it's what everyone is switching to.
Keeping with the author's example, is it easier to organize erotic fanfic with a computer, or enable a community to do it without spiraling out of control?
(Replying to PARENT post)
People tend to move towards the more mall-like areas of the Internet due to spam and abuse that they don't want to deal with. This can be low-level stuff, or (as in the case of Krebs himself) sometimes the attackers get out the big guns, and you need to run for cover.
And that's why we're hanging out here, after all, and not in some unmoderated forum. And even here, post on certain subjects and conversation quickly degenerates.
I think we do need a wider variety of spaces to hang out, though. No set of rules works for everyone. And if you do want 4chan, you know where to find it.
(Replying to PARENT post)
It sounds like he's saying ephemeral content is worthless and should be shunned.
I, and hundreds of millions of others, disagree. You want a bland, awful, boring society? Easy: make everything you do stick around forever—like a promise. And then watch the world self-police as the lifeblood drains out of it.
You'll get…Facebook. No thanks.
(Replying to PARENT post)
It's already been mentioned, but this guy needs to get out a bit more.
The internet is a city. There's the specialist shops (HN), the bustling malls (Reddit, YT), the shady back alleys (4chan, 8chan etc.), the historical districts (Usenet, Archive.org), the cafes (IRC, ICQ, Slack, etc.). To their credit, the author is more knowledgeable than most, however.
I see so many dismiss the internet as just Facebook or YouTube, or discuss trolling as if it's a single, recent phenomenon associated with social media. So many think that there's an internet culture. There isn't: there's an almost infinite number of overlapping, interlinked cultures. I can even map out the origins and historical influences of a few. There are even a few who think that social media sites are good forums of discussion. The poor sods: the Usenet was a better discussion forum than Facebook ever was, and the Usenet's not that great.
If you really want to see what the internet is like (that isn't advice for the author: I'm pretty sure the mall analogy doesn't encompass his internet experience, and is merely an analogue I find odd), explore. See it all, in all of its weird, wacky, zany, jokey, serious, offensive, manic, smart, stupid, brilliant, insane glory. I promise you, you won't be disappointed.
People ask me why I'm not on social media. It's because social media is boring. Unlike Reddit, 4chan, and the rest, not much interesting happens. Unlike HN, I'm not likely to be intellectually stimulated, or learn something new. Unlike static sites, I don't get to see that kind of wild creativeness that personal webspace tends to invite in hackers, nerds, and others who know what makes the web tick. I don't want to see what you ate, I don't want to see your cat, I don't want to hear banal details about your everyday life. I want to hear something interesting, new, and original. I want to hear the next Ze Frank, or Tom Ridgewell, or Simon Travaglia, or Steve Yegge, or RMS, or PG, or Ryan Dahl, and you can bet I won't on a site with a signal:noise ratio that low.
People also ask why I'm fascinated with the internet. My response is, why wouldn't I be? It's a catalogue of decades of human creativity and interaction. It's open mike night at the largest club in the world, which is also a discussion forum, and a shady back alley, and a convention. It is - to borrow and butcher Sir Terry's words - like being blindfolded and drunk at several different parties at once.
But, in what is rapidly becoming the sign-off on my incoherent, long-winded ramblings that are really only tangentially connected to the topic at hand, maybe I'm just totally mad.
EDIT: tried to clarify that I wasn't trying to insult the author. Not my intent, but it seemed to come off that way. It still does, but less so, and I prefer not to edit my old content too much. Also, I just checked out pinboard. Pinboard is amazing, and I am impressed.
Basically, don't take this as anything more than a tangential, incoherent ramble started by an analogy the author used which I found unrepresentative. Because that's what it is.
(Replying to PARENT post)
I wonder if the author truly understands "Machine Learning". What are his qualifications? A degree in Art History and some "programming experience" aren't very reassuring. E.g.
>> "The names keep changing—it used to be unsupervised learning, now it’s called big data or deep learning or AI"
WTF?? The author should enroll in a beginner Machine Learning course on Udacity or Coursera before making philosophical statements about fields he has zero clue about.
It seems the only skill the author has is piecing together meaningless arguments that appeal to average HN users incapable of distinguishing between informed opinions and pseudo-scientific rants. Hell, at least bad graduate students have to sit examinations, read papers and make original contributions that get peer reviewed (otherwise they fail/get-kicked-out/drop-out). Not like this guy, who does not understand the difference between "supervised" and "unsupervised" machine learning, yet feels comfortable making "prophetic" statements about machine learning.
Also
>>> "These techniques are effective, but the fact that the same generic approach works across a wide range of domains should make you suspicious about how much insight it's adding."
What does he mean by "same generic approach"? If we assume he is implying specific algorithms, then we have a good "No free lunch" theorem that shows that a single algorithm is not effective across all domains. Now if by "generic approach" the author means "machine learning" in general, then it's as ridiculous as saying
"Mathematics is effective, but the fact that the same Mathematical approach works across a wide range of domains should make you suspicious about how much insight it's adding."
The entire article is filled with "truthiness" and "feel-good" statements, which fall apart on closer examination.
(Replying to PARENT post)
I agree completely. This is something we should be cognizant of.