BleuPanda wrote:
Beatsurrender24 wrote:
Hi there. Thanks a lot for creating this spreadsheet – it's a really useful way of discovering great songs that I may have missed last year.
I'm a little confused by some of the maths, though. From what I can tell from a quick look over the formulae, the final song on a list receives the same score as the final song on any other list, regardless of how long the list is. So, for example, Something Just Like This by The Chainsmokers was named the 50th best song of 2017 in Uproxx's Top 50, and it therefore has approximately the same score as On + Off by Maggie Rogers, which Pitchfork named the 100th best song of 2017 in their Top 100.
Doesn't this mean that the spreadsheet is biased towards longer lists? The song that Pitchfork named as the 50th best of 2017 (Show You the Way by Thundercat) would have received twice the score of Something Just Like This, even though they were both at the same position (number 50) on their respective lists.
If you follow that through to its logical conclusion, that would then mean that, for example, being named the third best song on Reactor 105.7's list (106 entries long) is worth almost 11 times more than being named the third best song on The Music's list (only 10 entries long), heavily biasing the spreadsheet in Reactor 105.7's favour. This seems a little unintuitive to me. Surely a song's position on a list should be worth the same score irrespective of how long the list is, or how many other songs are beneath it?
Anyway, those are just my thoughts. It's your spreadsheet, so you can compile it however you want. Apologies if I'm reading the maths completely wrong and misrepresenting your calculations. Thanks again for the great work!
The problem is, if you were to simply make every position worth the same no matter the size of the list, it harms a lot of songs that miss smaller lists simply for lack of space. I believe the main purpose of having such a scale is to make not appearing on a list worth roughly the same amount everywhere. I'm not sure what the numbers in this case are, but imagine a scale where ranks 1-100 were given 100-1 points depending on their rank. In a list with 100 songs, the difference between being #100 and #101 is essentially 1 point. If a list only contains 10 songs, the difference between #10 and the unranked #11 would be 91 points! That would give a lot of weight to lists that frankly give us less information to actually work with, and an unfair advantage to those few songs that manage to make short lists. The issue isn't raw value so much as the difference assigned between positions.
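That hypothetical scale is easy to check in a few lines of Python (this is just the illustrative 100-to-1 scale from the paragraph above, not the spreadsheet's actual formula):

```python
def flat_points(rank, list_length, max_points=100):
    """Flat scale: every list pays rank r the same max_points + 1 - r,
    and any song that didn't make the list gets 0."""
    if rank <= list_length:
        return max_points + 1 - rank
    return 0

# On a 100-song list, the cliff at the bottom of the list is tiny:
gap_long = flat_points(100, 100) - flat_points(101, 100)   # 1 - 0 = 1
# On a 10-song list, the same cliff is huge:
gap_short = flat_points(10, 10) - flat_points(11, 10)      # 91 - 0 = 91
```

So under a flat rate, barely making a short list is worth a 91-point jump over every song that missed it, while barely making a long list is worth a single point.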
I decided to make a new thread so this doesn't end up cannibalizing the EOY songs thread.
BleuPanda wrote:
andyd1010 wrote:
Interesting. So longer lists allocate way more points than shorter lists – per song, in addition to the larger number of songs. I didn't realize that. I can see the pros and cons of both approaches. I have a similar project I've mentioned before, and I initially went with Sweepstakes Ron's approach before switching to Beatsurrender24's approach. But I just don't give any points to songs that don't appear on a list, which makes it a little different from the problem BleuPanda mentions about assigning an appropriate amount of points to songs that didn't make the cut.
Well, it's not like anyone's assigning points to songs that don't appear on a list; the problem is how far above zero the songs that are on the list end up. Being #3 on a top 100 list is different from being #3 on a top 10. In the end, what really matters is the point difference, not the total value of points. So, in my example distribution, every song ranked outside the top 100 would be 98 points behind song #3; but 97 songs (#4 through #100) sit at smaller gaps! The problem is that smaller lists have an additional 90 songs that we gain no information on. To me, with a top 100 list, I can go ahead and agree that the point difference between an imagined #101 and #200 would be so small that it wouldn't matter; but for a top 10 list, I think the imagined difference between the non-existent #11 and #101 does matter. Somewhere out there is a song that should be getting 90 points but isn't (if I'm using 100 as the ideal size).
The easiest solution is to shrink that size difference; it's messy, but you need some method that accommodates different list sizes. In other words, it's better to treat every unranked song as #11 if the list only has 10 songs. Why should they be treated as #101, other than the fact that other lists go to 100? That's essentially what using a flat rate suggests. A flat rate statistically gives more weight to smaller lists, which I hope we would agree is a bad thing, right?
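To make that concrete, here's a minimal sketch of the kind of length-aware scheme being argued for here – score each rank by its distance from the imagined first unranked position, so the step off the bottom of any list is a single point. (This is just an illustration of the idea, not necessarily how the actual spreadsheet computes its weights.)

```python
def length_aware_points(rank, list_length):
    """Score rank r on an n-song list as n + 1 - r: the last ranked song
    is worth 1 point, and unranked songs (the imagined #n+1) are worth 0."""
    if rank <= list_length:
        return list_length + 1 - rank
    return 0

# The step from the last ranked song down to "unranked" is now the same
# single point on every list, long or short:
drop_long = length_aware_points(100, 100) - length_aware_points(101, 100)   # 1
drop_short = length_aware_points(10, 10) - length_aware_points(11, 10)      # 1
```

This also reproduces the behaviour Beatsurrender24 noticed: a given position is worth more on a longer list (#3 on a 106-song list scores 104 points here, while #3 on a 10-song list scores 8).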
In a way, you are always assigning every song points; zero is a value. It's all relative.
Of course, this is just me giving my own analysis of why there should be different weighting based on the size of a list; Henrik likely has entirely different reasons.
First of all, I want to say how much I appreciate the work you've done, Sweepstakes Ron, and everyone else who has worked hard on this site to create these lists we all enjoy! I'm not saying I necessarily think you should change your method, but I think it's worth discussing, especially since I use a different method with my own spreadsheet, and maybe I can be convinced that my method isn't ideal.
Well BleuPanda, it does seem like this system assigns points to songs that don't appear on a list - you're saying the songs outside that hypothetical list all get credit for being #11, etc. But that isn't what I meant in my post: a longer list already has more weight by virtue of being longer, even if each song of equal rank is weighted the same. So devaluing the lower ranks of smaller lists creates an even bigger difference in the impact those lists have on the final results.
I guess you're arguing otherwise because of this: Say a 100-song list and a 10-song list have the exact same top 10. You'd argue that with equal point allocation the 10-song list has "more weight" because those 10 songs gain so much on the field with the 10-song list, while with the 100-song list many of their competitors in the spreadsheet appear later in the list and get a reasonable amount of points, so the top-10 songs gain less drastically on the field. Am I understanding the argument correctly or is there more to it?
My feeling is that songs that don't appear on small lists aren't at too much of a disadvantage since only 10 songs gain on them. Regardless of the allocation, a song that does not appear on either list is always impacted more negatively by the inclusion of the 100-song list, since 100 songs gain on it and the top songs are still gaining at least as much on the 100-song list as they do on the 10-song list, if not more, right? So other than my one example I don't see how smaller lists could be seen as having more weight.
How does it help to come up with hypothetical points for songs that might have appeared if lists were longer, rather than just using a formula that spits out a number of points for each rank regardless of list size and giving 0 points to songs that aren't listed? Isn't that what we do for most of our polls, where lists of various lengths are accepted?
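For concreteness, the flat approach described here amounts to something like this (a sketch with made-up song names, not anyone's actual spreadsheet):

```python
from collections import defaultdict

def aggregate_flat(lists, max_points=100):
    """Sum points across lists: rank r earns max_points + 1 - r on every
    list regardless of its length; songs absent from a list earn nothing."""
    totals = defaultdict(int)
    for ranking in lists:                      # each ranking: index 0 = #1
        for position, song in enumerate(ranking, start=1):
            totals[song] += max_points + 1 - position
    return dict(totals)

top100 = [f"song{i}" for i in range(1, 101)]   # a hypothetical 100-song list
top3 = ["song3", "song1", "songX"]             # a hypothetical 3-song list
scores = aggregate_flat([top100, top3])
# song1: 100 (for #1 on top100) + 99 (for #2 on top3) = 199
```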
Thanks in advance for the clarification!