Simple comparison polling to create utility functions

NunoSempere

Michael St Jules 🔸

I only took a quick look, but this looks pretty cool.

Here are two small things that might be worth doing:

1. Allow for negative values.
2. Allow users to input the multiplier by typing (or at least mark some places on the scale with values), since if you have a number in mind, it's a bit of work to find it.

NunoSempere

2. Is now implemented.

1. is a bit tricky, because the "is x times as valuable as" relation is kind of weird for negative inputs.

NunoSempere

Cheers, I've added both suggestions as Github issues to remember.

David Johnston

I would be interested in this same concept but framed so as to compare personal utility instead of impersonal utility, because I feel like I'm trying to estimate other people's values for personal utility and aggregate them in order to get an idea of impersonal utility. It seems tricky, though:

- How many {50} year old {friends/family members/strangers} would you save vs {5} year old {friends/family members/strangers}?

This seems straightforward, except maybe it's necessary to add "considering only your own benefit" if we want personal utilities that we can aggregate instead of a mixture of personal and impersonal utilities.

- How many 50 year old yourselves would you save vs 5 year old yourselves?

This one doesn't make much sense to me, and if I try to frame it differently, e.g.

"imagine a group of 50-74 year olds and a group of <5 year olds. There's a treatment that saves {X} 50 year olds and {Y} 5 year olds, and the <5 year olds dictate who gets it. What is the minimum X:Y for there to be a 50% chance of choosing the 50-74 year olds?"

My first thought is there's no way to sensibly answer this question because 3 year olds are incredibly stubborn and also won't understand.

Anyway, don't know if this is very helpful, but that was my first response to the app and the result of my first few minutes thinking about it.

skluug

I like this! UI suggestion: instead of "The first option is 5x as valuable as the second option", I would insert the sentence between them in the middle: "...is 5x as valuable as...". Or if you're willing to mess up marginal/total utility, you could format it as "One [X] is worth as much as five [Y]", which I think would help it be more concrete to most people.

NunoSempere

Done!

skluug

Cool!!

NunoSempere

Hey, this is a good idea, but it turns out it's slightly tricky to program. I'll get around to it eventually, though.

Ozzie Gooen

I just wanted to give my take on some of this:

- The web app is neat for experimenting with the ideas and helping us build intuitions.
- That said, I think the key ideas, not the web app in particular, are the main insight here.
- The current implementation is a solid first step, but I think we’re still a ways from having something that’s fun to use. My guess is that it will require some sophisticated UX / UI work to do a job that’s good enough for this to be useful in production. (If anyone reading this wants to try, let one of us know!)
- I also think it’s important to figure out how to allow for negative values. This is annoying, but so it goes.

One thing I learned over the course of this is that we probably don’t actually want big tables of utility estimates. Or, more specifically, we want functions that we can query with “how does X compare to Y?” and that give us the correct amount. These can trivially be converted to tables, but are subtly better, because they’ll handle correlations between items.

10 apples might be exactly 10 times as good as 1 apple, and 10 oranges exactly 10 times as good as 1 orange. We want a query of “how much better is 10 apples compared to 1 apple” to return exactly “10x”, and similarly for oranges. If we tried putting them all into a common unit, like “pear equivalent”, then we wouldn’t get this property.
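A minimal sketch of the apples example, under stated assumptions: the per-unit ranges in `UNIT_VALUE_RANGE` are made-up numbers, value is assumed to scale linearly with quantity, and both function names are hypothetical. It only illustrates why going through an uncertain common unit loses the exact "10x" answer, while a direct comparison function keeps it.

```python
# Hypothetical per-unit values in a shared unit ("pear equivalents").
# Each value is uncertain, so it is stored as a (low, high) range.
UNIT_VALUE_RANGE = {
    "apple": (0.5, 2.0),
    "orange": (1.0, 3.0),
}


def compare_via_table(item_a, qty_a, item_b, qty_b):
    """Table approach: convert each side to a range independently,
    then divide the ranges. Returns (low, high) for value(a) / value(b)."""
    lo_a, hi_a = UNIT_VALUE_RANGE[item_a]
    lo_b, hi_b = UNIT_VALUE_RANGE[item_b]
    return (qty_a * lo_a / (qty_b * hi_b), qty_a * hi_a / (qty_b * lo_b))


def compare(item_a, qty_a, item_b, qty_b):
    """Comparison-function approach: when both sides are the same item,
    the shared uncertainty cancels and only the quantities matter."""
    if item_a == item_b:
        ratio = qty_a / qty_b
        return (ratio, ratio)
    # Different items: fall back to the independent ranges.
    return compare_via_table(item_a, qty_a, item_b, qty_b)


table_answer = compare_via_table("apple", 10, "apple", 1)
function_answer = compare("apple", 10, "apple", 1)
print(table_answer)     # (2.5, 40.0)
print(function_answer)  # (10.0, 10.0)
```

The table route treats the two apple estimates as if they were unrelated, so the answer spreads out to somewhere between 2.5x and 40x; the function knows both sides are the same item and returns exactly 10x.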

I’m not sure what the best format is to store this sort of data. Maybe some cluster analysis or something. There must be some clever mathematics for this somewhere, but it’s not clear to either of us.

mako yass

I've been thinking about this sort of preference aggregation problem for a few years. I think the best way to do it, that we have right now, is to form a graph with edges weighted by the comparison strength, then do pagerank, or something like it. Rank entries by their pagerank scores.
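A rough sketch of the graph-plus-pagerank idea, assuming one particular encoding that the comment does not specify: each comparison "X is w times as valuable as Y" becomes an edge from Y to X with weight w, so rank flows towards the more valued option. The function name, the damping value and the example comparisons are all illustrative assumptions.

```python
def pagerank(edges, damping=0.85, iterations=100):
    """Weighted PageRank by power iteration.

    edges: dict mapping (source, target) -> positive weight.
    Rank flows from source to target, so here an edge points from the
    less valued option to the more valued one.
    """
    nodes = sorted({node for pair in edges for node in pair})
    n = len(nodes)
    out_weight = {node: 0.0 for node in nodes}
    for (src, _dst), weight in edges.items():
        out_weight[src] += weight

    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes with no outgoing edges is spread evenly.
        dangling = sum(rank[node] for node in nodes if out_weight[node] == 0.0)
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        for (src, dst), weight in edges.items():
            new_rank[dst] += damping * rank[src] * weight / out_weight[src]
        rank = new_rank
    return rank


# Hypothetical comparisons: A is 5x B, B is 2x C, A is 10x C.
comparisons = {("B", "A"): 5.0, ("C", "B"): 2.0, ("C", "A"): 10.0}
scores = pagerank(comparisons)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)  # ['A', 'B', 'C']
```

One caveat with this encoding: outgoing weights are normalised per node, so a comparison's strength only matters relative to the other comparisons leaving the same node. The scores give a ranking, not ratios that can be read as "x times as valuable".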

But I've been working towards something more precise, and this might be novel: Parallel and serial reducer functions (and another one, a "crosslink" reducer, I believe), sort of like the reducer functions you'd use to make judgements about electronic circuit graphs, which, given two nodes in the graph, work together to reduce the edges of the graph to a single edge between them, and then you know how those edges compare.

I have a really firm and consistent and unambiguous sense of how the reducers should behave, but I'd need to spend some time with a mathematician and a whiteboard to come up with formalizations. I'm pretty confident we'd produce something legit, if that were set up, though!

In case you're wondering how it handles cycles: If the cycle is even it resolves that each option in it is equal. If the cycle is a little bit lopsided then it creates a ranking, but with high controversy. If it's extremely lopsided then it creates a ranking with low controversy.
If you put a gun to its head, it'll always be able to tell you which nodes are at the top of the ranking, but it can also tell you when there's a lot of uncertainty between them.

NunoSempere

My sense is that the mathematized version would be much more valuable (for instance, I could incorporate it into my tooling), but also harder to obtain than you might realize.

gwern

I dunno if it's that hard. Comparisons are an old and very well-developed area of statistics, if only for use in tournaments, and you can find a ton of papers and code for pairwise comparisons. I have some, plus an R utility in a similar spirit, on my Resorter page. Compared (ahem) to many problems, it's pretty easy to get started with some Elo or Bradley-Terry-esque system and then work on nailing down your ordinal rankings into more cardinal stuff. This is something where the hard part is the UX/UI and tailoring to use-cases, and too much attention to the statistics may be wankery.
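For readers who have not seen one, here is a minimal sketch of a Bradley-Terry fit using the standard iterative (minorise-maximise) update. It is not the Resorter code; the function name and the win counts are made up for illustration, and it assumes every option has won and lost at least once so that the estimates stay finite.

```python
def bradley_terry(wins, iterations=200):
    """Fit Bradley-Terry strengths from pairwise outcomes.

    wins: dict mapping (winner, loser) -> number of times winner was preferred.
    Returns strengths normalised to sum to 1, where the model says
    P(i preferred over j) = strength[i] / (strength[i] + strength[j]).
    """
    items = sorted({item for pair in wins for item in pair})
    strength = {item: 1.0 for item in items}
    for _ in range(iterations):
        updated = {}
        for i in items:
            total_wins = sum(c for (winner, _loser), c in wins.items() if winner == i)
            denom = 0.0
            for j in items:
                if j == i:
                    continue
                # Number of times i and j were compared, in either direction.
                n_ij = wins.get((i, j), 0) + wins.get((j, i), 0)
                if n_ij:
                    denom += n_ij / (strength[i] + strength[j])
            # Items never compared keep their previous strength.
            updated[i] = total_wins / denom if denom else strength[i]
        norm = sum(updated.values())
        strength = {item: value / norm for item, value in updated.items()}
    return strength


# Hypothetical outcomes: A mostly beats B and C, B mostly beats C.
outcomes = {
    ("A", "B"): 3, ("B", "A"): 1,
    ("B", "C"): 3, ("C", "B"): 1,
    ("A", "C"): 4, ("C", "A"): 1,
}
strengths = bradley_terry(outcomes)
order = sorted(strengths, key=strengths.get, reverse=True)
print(order)  # ['A', 'B', 'C']
```

The ratio of two fitted strengths is the modelled odds that one option is preferred over the other, which is not automatically the same thing as "x times as valuable"; turning it into a cardinal scale is the extra step mentioned above.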

NunoSempere

"Comparisons are an old and very well-developed area of statistics"

Yeah, but it's not clear to me that discrete choice is a good fit for the kind of thing that I'm trying to do (though I've downloaded a few textbooks, and I'll find out). I agree that UX is important.
