How the Lobsters front page works
62 points by atharva
I'm not the author of this code, but I've been maintaining it for a while. This is a great writeup. I'm pretty sure it's based on the Reddit link ranking algorithm, which was well-described by Evan Miller here and in a followup.
It would be great to see a similar writeup and simulator for comment ranking ("confidence"), which is a different algorithm. It's been tinkered with a lot over the years. I'm not sure it makes much sense anymore, and I'm pretty sure the tag weighting is just applied wrong with addition instead of multiplication. And I covet a very clever HN feature for getting attention to new threads where (at least for the first couple days) the threads are sorted best-first, but with the newest interleaved, so it goes (newest comment, top thread, 2nd newest comment, 2nd thread, etc). We haven't seen the unfortunate Reddit gaming of "hijacking the top comment to say..." but I'd like to encourage late comments more. (This is also why /active is next to the logo.)
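To make the interleaving idea concrete, here is a rough sketch of that ordering as described above. It is not HN's or Lobsters' actual code; the `score` and `posted_at` fields and the function name are made up for illustration.

```python
from itertools import zip_longest


def interleaved_order(threads):
    """Return thread ids as: newest, best, 2nd newest, 2nd best, and so on.

    `threads` is a list of dicts with hypothetical keys "id", "score",
    and "posted_at" (larger means more recent).
    """
    best_first = sorted(threads, key=lambda t: t["score"], reverse=True)
    newest_first = sorted(threads, key=lambda t: t["posted_at"], reverse=True)

    seen = set()
    ordered = []
    for newest, best in zip_longest(newest_first, best_first):
        for thread in (newest, best):
            # A thread can be both new and highly ranked; place it only once.
            if thread is not None and thread["id"] not in seen:
                seen.add(thread["id"])
                ordered.append(thread["id"])
    return ordered


threads = [
    {"id": "a", "score": 10, "posted_at": 1},
    {"id": "b", "score": 5, "posted_at": 2},
    {"id": "c", "score": 1, "posted_at": 3},
    {"id": "d", "score": 7, "posted_at": 4},
]
print(interleaved_order(threads))
```

A real version would presumably only interleave during the first couple of days and fall back to plain best-first afterwards.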
I'm open to improvements on these. One big difference between us and Reddit/HN is that we don't have downvotes to indicate disagreement. jcs required that downvoters pick from a list of predetermined reasons, but the familiar UI didn't work out. Commenters can see the flags on their comments, so we had a lot of discussions immediately distracted into meta conversations about why a comment was flagged. I changed the UI to move it from a down arrow over to "flagging" and spent a lot of time emphasizing in comments and messages that it should be used for things that need mod attention rather than disagreeing. There was even a pretty recent change to enforce that users can reply to or flag a comment but not both (stream 1 and 2).
I'm also open to improvements on improving the tone and quality of conversations, though that's harder to nudge than a couple lines of javascript.
(sorry, but I did not decide to look at the code, so just offering my observations below)
The biggest issue I as a commenter notice is the perennial "fastest gun in the west problem". In fact, I didn't even realize that Lobsters interleaved best and new comments on threads; is the order actually "first best, first newest, second best, ..."? In this thread, for example, I see your comment at the top and @Forty-Bot's comment (which is "first best, second best" unless I'm mistaken --- is there perhaps some hiding done if you are the one who made the newest comment?).
I know you say that "We haven't seen the unfortunate Reddit gaming of 'hijacking the top comment to say...'..." but I have definitely found myself posting a reply to the discussion in the top comment which is debatably better served as a top-level comment (sometimes I do both and link to the top-level comment). Analyzing my own behavior, I find I want to be where the discussion is at --- and sometimes that ends up being in the children of one of the top-level comments.
Also (again anecdotally) early snark seems to have a much higher chance of ending up as the top-voted comment, whereas later snark is more likely to stagnate.
IMO the biggest issue with fast replies is that upvotes beget more upvotes (ditto for comments). I noticed that there is some sort of initial hiding of karma on posts with low votes; I wonder if this could apply to all comments regardless of karma? I unfortunately have to leave (otherwise my reply would be a lot longer with other ideas and comments), so this is more a collection of hurried thoughts and a "seconded" to your request for more discussion on comment ranking.
I'm also open to improvements on improving the tone and quality of conversations, though that's harder to nudge than a couple lines of javascript.
Though I don't use the site nowadays, I really like the first sentence of Reddiquette.
Remember the human. When you communicate online, all you see is a computer screen. When talking to someone you might want to ask yourself "Would I say it to the person's face?"
I am calling the bluffs of all the self-proclaimed blunt, socially awkward, or whatever else communicators: there are many things I see said here that you simply would not say to someone else's face. I am a firm believer that disagreement is not a reason to be unkind; put differently, you can always say what you want to say in a way that conveys good intentions. And for all the know-it-alls, if you're so sure that you are right, you should be excited by the opportunity to teach, not preach.
I've given up on the front page algorithm. Popular stories stay on top for way too long before getting demoted. It's common for a story to stay on the front page for days, and it's pretty annoying to have to scroll past the same story for the 3rd or 4th time. At this point I only browse new, which frustratingly takes two clicks to get to from the front page.
Popular stories stay on top for way too long before getting demoted. It's common for a story to stay on the front page for days.
This is what attracts me to Lobsters and makes me prefer it over HN. Stories on Lobsters have time to "breathe": people from different timezones have time to see them, contribute to them, and have meaningful back-and-forth discussions going over days.
On HN a post has a 10 to 30 minute window to make it to the top. Stories move so fast that there is now a European HN and an American HN (and to a lesser extent an Asia/Australia HN), where non-overlapping groups of people superficially engage for a few hours before moving on to the next story on the front page.
Is that really the algorithm? I think the site just moves more slowly, there aren't that many new posts. Unfiltered /newest often shows posts that are over 24 hours old and if you have some tags filtered it can easily go back several days.
Is there a reason you don't click "hide" on the ones where you don't want to continue following the discussion?
I generally only hide posts that I find repulsive (but that commentators here love to comment on). I not infrequently use the search bar to look for posts, so I don't want to intentionally degrade my search results. Sometimes posts have great technical content, but overstay their welcome.
frustratingly takes two clicks to get to from the front page.
Why not add a direct bookmark BTW?
I read HN like that, I filter out everything below some score, and sort the rest by date, so as soon as I hit a known title, I know I can stop reading.
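A minimal sketch of that reading routine, assuming each story is a dict with hypothetical `title`, `score`, and `posted_at` fields (this is not any site's API):

```python
def unread_titles(stories, min_score, last_seen_title):
    """Titles scoring at least `min_score`, newest first, up to the first known one."""
    kept = [s for s in stories if s["score"] >= min_score]
    kept.sort(key=lambda s: s["posted_at"], reverse=True)

    fresh = []
    for story in kept:
        # Everything from the first already-seen title onwards is older, so stop.
        if story["title"] == last_seen_title:
            break
        fresh.append(story["title"])
    return fresh


stories = [
    {"title": "old news", "score": 40, "posted_at": 1},
    {"title": "seen yesterday", "score": 25, "posted_at": 2},
    {"title": "low score", "score": 3, "posted_at": 3},
    {"title": "fresh and popular", "score": 30, "posted_at": 4},
]
print(unread_titles(stories, min_score=20, last_seen_title="seen yesterday"))
```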
Authors submitting their own content get a tiny boost, which is mildly surprising given the otherwise strict self-promo rules.
I had an initial surprise as well upon reading this in your article, but I think it makes sense only because of the self-promo rules. I am more inclined to read stories that are self-submitted, after all. It's a treat getting to ask the author questions or reply to their post.
Yet, my experience on the website has been far from ideal. For me, this is rooted in a disconnect of values with the group most engaged on the site, whose votes and discussions drive the climate. ... Studying the algorithm has shown me that disengaging would make my problem worse—a single user's participation can be worth a lot.
I am sorry you feel that way and I indeed hope you engage more. I find myself on the less popular side of various culture wars fought here sometimes and it really does seem like a futile effort trying to justify my position (and likewise, sometimes I have to stop myself feeling emboldened when I know my opinion is shared). Know that there are those of us who read and upvote dissenting opinions --- and that there are those of us who still value your opinion even if we do not agree with it.
Really neat analysis and visualisation. I have been thinking about the ranking algorithm recently for a potential side project, so I had a dig through the maths. I actually think there's a small inaccuracy which traces back to some redundant code, although it's possible that I have misunderstood something.
The sign negates the effect of comment upvotes when the story scores zero, and make them contribute negatively to the rank when the story scores below zero.
As far as I can tell, the sign parameter doesn't do anything! The variable is intended to flip the sign of order if score is negative. However, if score is negative:
cpoints <= score (because cpoints is the minimum of score and another value), and therefore |score| <= |cpoints|
|score + 1| < |score| (score is a count of votes, so the negative value closest to zero is -1)
Therefore, |score + 1| < |cpoints|. Since cpoints is negative, we can infer that |score + 1| + cpoints is less than 0.
This means that
order == log10(max(|score + 1| + cpoints, 1))
      == log10(1)
      == 0
So if the score is negative, the value of order is guaranteed to be 0 - so the sign is redundantly multiplying 0 by 1 or -1.
I've played around with the little example model the author provides and it seems to agree with this: when the story has a negative score, the order always comes out as 0, regardless of the number of comments.
If I'm not mistaken this is a fun case of some redundant logic in the code - I guess at some point there were scenarios where the order could be non-zero with a negative score, which would have the effect the op suggests.
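For anyone who wants to check the argument without the simulator, here is a small brute-force sketch. It follows the formulation in this comment (cpoints capped at score, order as a log10 of a value floored at 1), not the real Lobsters source, so the function names and the cap are assumptions.

```python
import math


def order(score, comment_points):
    """Order term as described above: log10 of max(|score + 1| + cpoints, 1)."""
    # Assumption from the comment: comment points are capped at the story score.
    cpoints = min(comment_points, score)
    return math.log10(max(abs(score + 1) + cpoints, 1))


def sign(score):
    """1 for positive scores, -1 for negative scores, 0 for zero."""
    return (score > 0) - (score < 0)


# Try every negative score with a wide range of comment points.
nonzero = [
    (score, cp)
    for score in range(-50, 0)
    for cp in range(0, 201)
    if order(score, cp) * sign(score) != 0
]
print("negative-score cases with a non-zero signed order:", len(nonzero))
```

Under these assumptions the list stays empty, so multiplying by sign only matters when the score is exactly zero.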