Concerns with SourceCred

When SourceCred was originally implemented, it was “just an experiment we’ll review & be able to drop whenever.” By now it seems to have made its way into the core mechanics & roadmap. It seems so deeply ingrained that we can no longer detach from it, and we never actually reviewed it or considered dropping it.

Now, I’m not saying we should drop it, but I do still have some concerns.

I hope most of them are due to my lack of knowledge, and I’m sure some of them are.

  1. Just because something created a lot of emotional reaction, doesn’t mean it created a lot of value.

People will react more to short things that create emotional reaction vs long articles that explain things in-depth. You might be pissed when someone’s memes start creating more XP than your well thought out content or code.

Here’s an example:

  2. Just because nobody is building on what you committed to github, doesn’t mean it’s not valuable. Maybe you wrote it so well it never has to be revised.
    On the other hand, people building & changing what you committed might mean exactly that you wrote something of low quality or incomplete, so it needed to be improved upon a lot.
  • As far as I understand, Sourcecred will award the latter more?

  • Same goes for referencing things on Discord. Just because you’re replying to someone doesn’t mean they said or did something useful. It could be quite the opposite as it was fairly recently.

  3. You say it is not a black box system, but as far as I know there is no way for me to see how the people that have Seeds got those Seeds. I can check the weights but that is not a proof of anything else, because the weights can be correct and cred still wrongly attributed.

  4. One issue that I haven’t heard properly addressed is that as the price of Seeds grows through the bonding curve, XP should be redeemable for less & less Seed.

  • Otherwise there’s no incentive for workers to get in early.
  • Another thing that makes this even worse is reactions creating cred.

As we get more and more people engaging on Discord and Discourse, there will be more reactions to things, creating, for the latest content, even more XP than the early contributors earned.
Theoretically we could just reduce the weights from reactions as we scale, but as sourcecred recalculates everything retrospectively, it would make no difference.

  5. Looks like a blanket solution. Pull requests on the wiki should probably be awarded less than the ones on Interspace, especially as the wiki content will mainly be built on forums. But you can assign different values to different repos?

  6. Same question about channels on the Discord and categories on Discourse, posts in different ones can be made to weigh differently, right?

  7. Looks like it might turn out more restrictive than we thought & incentivize inefficiency.

  • We built a wiki based on github so sourcecred can read contributions, but… is this really the interface we can expect meme-creators and writers to contribute through?
  • Where embedding memes looks like this? :joy:

To be fair, it can be edited in an external markdown editor and the code imported here. So it’s somewhat easily mitigated. Adds to complexity but fine.

  • I bulk uploaded 10 memes I made. If I was uploading them & creating pull requests 1 by 1, I’d be getting 10x cred I’m getting now… Incentivizing me to act inefficiently.
  • If we want cred to easily flow to other contributors, we should be collaborating on writing docs on the forums. There’s much more friction to doing that than using Google Docs.
  8. On to attributions. On the call following InterCon, we spent 45 minutes discussing and putting down weights on each contribution.
  • It required all of us to be on the call, and imagine doing this with more people on a bigger project. Not scalable.
  • Moreover, these contributions are still not calculated into cred.
  • If we did it the old way, it would have required 2-5 minutes of each participant’s time and could have been done async. We’d be done with it all the way back then.
  • Dependencies are easy to write now, but might get out of hand as we move forward because everything requires almost everything before it.

Hi, not sure if this helps but I got “mixed feelings” after seeing the sourcecred project… as a developer I thought there are only two options:

a) this is big
b) it’s a fad

In any case I didn’t decide to learn more about the details because something (at least for me) seemed off… probably the sheer impossibility of doing this right even in theory.

Maybe you should be careful and, as a first step, disentangle from it, and when it is actually proven some more, integrate carefully again.

But in any case: this type of system is invasive. I would not use it ever. It’s from a more dystopian future where just everything has a point attached to it. Having bad ideas is not a bad thing when these ideas are discarded early :slight_smile:

I’m just one developer, not sure about the others (developers or not). Let the best future win :slight_smile:


Hey @peth, you raise some great concerns here. A lot of the concerns you bring up aren’t fundamental issues with SourceCred at a conceptual level; they are limitations that come from how young and immature SourceCred is as a project.

The core assumption of SourceCred (the paradigmatic kernel) is:

  • value doesn’t exist in isolated artifacts or events, and you can’t measure value by counting events
  • instead, you need to look at the relationships between contributions to understand their value
  • contributions are valuable if they are connected to, or depended on by, other valuable contributions
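
To make these assumptions a bit more concrete, here is a minimal sketch of cred flowing through a contribution graph with a PageRank-style iteration. The graph, the node names, and the damping value are all made up for illustration, and this is a toy under those assumptions, not SourceCred's actual implementation.

```python
# Hypothetical toy graph: an edge A -> B means "A references or depends on B",
# so score flows from A to B. All names and numbers are illustrative only.
EDGES = {
    "meme": [],
    "design_doc": [],
    "wiki_page": ["design_doc"],
    "pull_request": ["design_doc"],
    "forum_post": ["design_doc", "wiki_page"],
}


def cred_scores(edges, damping=0.85, iterations=50):
    """Run a simple PageRank-style iteration over the contribution graph."""
    nodes = list(edges)
    n = len(nodes)
    scores = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every node gets a small baseline share, plus whatever flows in.
        new = {node: (1.0 - damping) / n for node in nodes}
        for source, targets in edges.items():
            if targets:
                share = damping * scores[source] / len(targets)
                for target in targets:
                    new[target] += share
            else:
                # Nodes with no outgoing edges spread their score evenly.
                share = damping * scores[source] / n
                for node in nodes:
                    new[node] += share
        scores = new
    return scores


scores = cred_scores(EDGES)
for name, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name:13s} {score:.3f}")
```

In this toy graph the design doc ends up with the highest score because other contributions point at it, while the meme, which nothing references, only receives the share that every node gets regardless of its connections.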

This core set of assumptions is really flexible. Contributions can be anything: features, forum posts, in-person conversations, moments of inspiration, new designs, wiki edits, pep talks, pull requests, you name it.

However, right now we’re not using that system at its full potential, because our data comes from existing web2.0 platforms like Discourse, GitHub, Discord, etc. So we’re stuck just with posts, pull requests, comments, reactions, as our “world” of contributions. We’re missing a lot of the most important contributions, because emotional labor, vision building, planning, community organizing, etc, don’t leave clear digital traces.

However: a big, big belief of the SourceCred project is that we need to launch an imperfect system as soon as possible, i.e. as soon as we have a solid foundation. This way, we can start dogfooding and learning from the system, collecting real user feedback. Otherwise, we risk building a system that sounds good in theory but isn’t usable in practice (and smart people have wasted years of their lives on these kinds of wild goose chases).

We started that dogfooding last September, when we launched The CredSperiment. We knew it still had a lot of rough edges, but wanted to see what would happen anyway. And the initial results have been really good! We’ve been running it for over 6 months now, and have distributed >$300k worth of SourceCred grain (backed by funding from Protocol Labs). The system hasn’t worked perfectly, and we’ve needed to change the weights a few times. However, the experiment has been a success, both in that the community gained strength and cohesion during the CredSperiment, and that we’ve learned a ton about how to improve SourceCred.

For example, take SourceCred’s focus on reactions as a form of cred minting. Reactions are problematic, but they’re actually better than what we started with. At first, we were minting cred directly on Discourse posts and topics. But that encouraged a race-to-the-bottom of focusing on quantity over quality. Based on what we saw and what we learned, we switched to minting cred on likes instead of posts. This is more robust than minting on raw activity, but still has problems.

We’re already deploying the next generation of cred minting, which is called the initiatives plugin. Basically, with the initiatives plugin, the community collectively agrees which high-level contributions were valuable, and how high-level contributions (‘supernodes’) depend on each other, and on specific pieces of work. Our first batch of initiatives focus on rewarding the work that went into organizing CredCon 2020 (it was missed since it didn’t happen on Discourse or GitHub), and on rewarding the ongoing work in setting up our official documentation.

In the coming months, we’re also going to introduce boosting, which will enable decentralized & incentive-compatible cred minting. To put it in MetaGame parlance: using boosting, any contributor or sponsor can spend their SEEDs in order to mint new XP to a contribution or initiative. In return, the booster gets a share of the XP that contribution earns. So, if you see someone has just launched a sick new project that you want to support, you can boost it with your seeds, which rewards everyone working on it with more XP. And if the project goes on to be hugely important in MetaGame, you will get a ton of XP for having been its early booster.

I encourage you to do a review, and to keep digging into the strengths and weaknesses of SourceCred. A lot of people think SourceCred is some sort of magical system that will make all your community’s value judgements for you. But I don’t think making such a system is actually possible. What SourceCred is instead is a tool: a protocol for coming to consensus on which contributions were valuable, in a retroactively-updating and non-transactional way.

We’ve been working on this tool because crypto / web3 projects need it – the open source ecosystem needs it – and no-one else is building feasible solutions. People have mostly focused on bounty or task based systems, where you either put an upfront price on every task, or pay a fixed amount for every task based on # of hours worked or such. But these systems can’t scale, and don’t reward long-term alignment. They quickly devolve into people focusing on short-term arguments about how to split a bounty, or how many hours of work something “really” took, rather than focusing on long-term results and long-term alignment.

The key feature of SourceCred is that you don’t need to worry about the short-term rewards. In the long run, if your work is valuable, people will have to notice because it will be so good they can’t ignore it. And then people will link to and depend on your work, and the grain harvest strategies pay ppl who were historically under-valued first, so you’ll be made whole.
Because of this, it’s actually possible for everyone on the project to focus on doing valuable things first-and-foremost, and trust that as the system gets better and better, they’ll be rewarded more accurately.

However… it’s still just a tool. It won’t solve your problems for you, but you can use it to solve your problems. And it’s a beta-quality tool that has a lot of known issues. The reason it’s working for SourceCred in the CredSperiment is that we are all working together to keep the cred quality high, and that we are willing to accept changes and volatility while we prototype it. If you start using it in the near future, you should expect some of the same volatility yourselves.

Also, I recommend taking a look at the SourceCred Beta Partner Program. It’s basically a program for early adopters where the SourceCred team agrees to actively support a community, help work through thorny issues, etc. Our communities are already pretty close so I think it could be a good fit.

By the way, for anyone who wants to learn more about SourceCred and prefers videos to walls of text, I recommend this talk I gave at ETH Denver.


Great post and great questions @peth, definitely agree that we should put as much rigour into these decisions as possible, and this thread will hopefully serve as a good resource for all the players to understand the systems we are building better.

My thoughts:

  1. Just because something created a lot of emotional reaction, doesn’t mean it created a lot of value. … You might be pissed when someone’s memes start creating more XP than your well thought out content or code.

This is actually something that SourceCred is designed to mitigate. The value of any given activity is not determined by that activity alone, but by how “connected” it is in the graph. Just because something has fewer likes doesn’t mean it will earn less XP.

Something like a random meme might get a lot of likes, but it won’t really be well connected to the rest of the activity in the graph. If someone makes some well-thought-out content and that content plays an important role and gets referenced in the future, or inspires others to discuss it more or reference it later (e.g. we add it to the wiki or something), then it will actually continue to earn XP from everything it’s connected to. A meme might earn more XP on its “own”, but it won’t be earning as much “future XP”.

At the same time, memes are powerful, so their impact shouldn’t be completely discredited either, since they bring exposure and spread the message. They will just earn their XP in different ways for different reasons, and it’s up to us to decide how much we want to value memes over other stuff.

Just like in video games, you have to “balance the meta” and make sure one thing isn’t too overpowered. We can decide the weights for different types of activity and tune them to our preferences. It’s all under our control. There is no “objective truth” for how much memes are worth vs long thoughtful posts, and the answer might differ depending on the community using it. If we notice that some things are getting more XP than we think they deserve, then we can adjust the “weights” to balance the meta of the game.

  2. Just because nobody is building on what you committed to github, doesn’t mean it’s not valuable. Maybe you wrote it so well it never has to be revised.

People don’t have to be building on the code that you wrote directly. Any contributions to a repo will be connected to the previous contributions. Additionally, adding more code isn’t the only way that future XP can flow to a past contribution. For example, for InterCon, we can say that Interspace was a dependency of InterCon, and even if no one was building on Kay’s work, he earned more XP from it because it was being used or referenced.

If you write some code and that thing never sees any new activity, and never gets used or enables things to happen in the future, then it likely wasn’t that valuable. I can’t think of an example of software that would be written once and never touched again, but would continue to deliver long term value without being directly referenced or used in the future.

Same goes for referencing things on Discord. Just because you’re replying to someone doesn’t mean they said or did something useful.

Yes, this is true. The Discord plugin was built over one weekend, and we just made assumptions about what might convey that something was valuable. Having just “likes” wasn’t enough because I noticed that there was a ton of valuable discussion happening that didn’t get likes but had lots of engagement from users. At that time, we didn’t think of the situation where something negative or distracting was causing people to mention others.

Now that we have seen that happen, it has come to light that maybe it’s not the best metric to have. But that’s the beauty of SourceCred! We can decide to remove it or change how it works to make it more accurate for our use case, and it will automatically re-calculate the historical XP, making it as if we never made that bad decision in the first place. This is incredibly powerful since it lets us experiment and see what works in practice instead of trying to design a perfect system off the bat (which is impossible). This is also the reason we want to have “vesting” for SEED and not pay it out all at once: it gives us room to tweak the system while ensuring that in the long term, as we make the system better, the people who are most deserving of SEED will get their fair share.

Since this post is getting super long, will split up the rest of my thoughts for other questions in separate comments to allow people to reply to specific parts and break up the thoughts a little.

  3. You say it is not a black box system, but as far as I know there is no way for me to see how the people that have Seeds got those Seeds. I can check the weights but that is not a proof of anything else, because the weights can be correct and cred still wrongly attributed.

This is true, currently there isn’t a proper visualization for non-technical users to see how and when XP was earned. Even for technical users, better data-visualization of what is happening and why will be very important for us to improve and optimize the system to work how we want it. The data is there, we just need to make it more accessible, and this is definitely a big priority with a re-write of the algorithm to make things a lot nicer to interpret: https://discourse.sourcecred.io/t/credrank-scalable-interpretable-flexible-attribution/654

  4. One issue that I haven’t heard properly addressed is that as the price of Seeds grows through the bonding curve, XP should be redeemable for less & less Seed.
    As we get more and more people engaging on Discord and Discourse, there will be more reactions to things, creating, for the latest content, even more XP than the early contributors earned.

I’ve definitely been thinking about this more, and you are 100% right. Setting a “fixed ratio” of X seed per XP is not a good idea. @decentralion and I were talking about this today, and we actually removed the option to even use that as a “payout strategy”. A much better way to do it would be to mint X amount of SEED per week and distribute it based on the proportion of XP, so the absolute amount of XP doesn’t matter. We can decide to change how much SEED per week we want to mint in the future as we scale/grow. The SEED per week might be going up still since there’s more players earning more XP, but the SEED per XP ratio is what would go down.
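
As a rough sketch of the "fixed SEED per week, split by share of XP" idea, assuming made-up player names, XP totals, and a budget of 100 SEED (none of these numbers come from the real payout config):

```python
def distribute_seed(weekly_budget, xp_by_player):
    """Split a fixed SEED budget in proportion to each player's XP."""
    total_xp = sum(xp_by_player.values())
    if total_xp == 0:
        # Nobody earned XP this period, so nothing is paid out.
        return {player: 0.0 for player in xp_by_player}
    return {
        player: weekly_budget * xp / total_xp
        for player, xp in xp_by_player.items()
    }


# Hypothetical numbers: the same 100 SEED budget in an early and a later week.
early_week = distribute_seed(100, {"alice": 30, "bob": 10})
later_week = distribute_seed(100, {"alice": 30, "bob": 10, "carol": 160})

print(early_week)  # {'alice': 75.0, 'bob': 25.0}
print(later_week)  # {'alice': 15.0, 'bob': 5.0, 'carol': 80.0}
```

In this toy run the total XP grows from 40 to 200 while the budget stays at 100 SEED, so the payout drops from 2.5 SEED per XP to 0.5 SEED per XP without anyone having to touch a fixed exchange rate.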

Theoretically we could just reduce the weights from reactions as we scale, but as sourcecred recalculates everything retrospectively, it would make no difference.

This wouldn’t be necessary if we modulate how many SEEDs we mint like I described above, but the point about SourceCred recalculating everything retroactively is valid because there are many situations where we actually DON’T want that to happen. We want it to happen when we want to fix past mistakes / flaws, but there definitely needs to be a way for us to “shift our priorities” and change the weights for future activities without affecting the past XP. This is also something SourceCred plans to address in the future, and the new CredRank algorithm will help to make this possible.
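
One way to picture "change the weights for future activity without affecting past XP" is a weight history with effective dates. This is purely a hypothetical sketch: the dates, weights, and function names are invented here, and it does not describe how SourceCred or CredRank handle this.

```python
from datetime import date

# Hypothetical weight history for one activity type: (effective_from, weight).
REACTION_WEIGHTS = [
    (date(2020, 1, 1), 4.0),
    (date(2020, 6, 1), 1.0),  # priorities shifted: reactions count for less from here on
]


def weight_on(day, history):
    """Return the weight in force on `day` (the latest entry not after it)."""
    current = 0.0
    for effective_from, weight in sorted(history):
        if effective_from <= day:
            current = weight
    return current


def total_xp(reaction_days, history):
    """Sum XP for reactions, each valued at the weight in force on its own day."""
    return sum(weight_on(day, history) for day in reaction_days)


reactions = [date(2020, 3, 15), date(2020, 3, 20), date(2020, 7, 1)]
print(total_xp(reactions, REACTION_WEIGHTS))  # 4.0 + 4.0 + 1.0 = 9.0
```

Appending a new entry only affects activity from that date onward, while editing an older entry would still rewrite history on purpose, which is the "fix past mistakes" case.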

  5. Looks like a blanket solution. Pull requests on the wiki should probably be awarded less than the ones on Interspace, especially as the wiki content will mainly be built on forums. But you can assign different values to different repos?
  6. Same question about channels on the Discord and categories on Discourse, posts in different ones can be made to weigh differently, right?

Yup, we can configure the weights for any repo, any type of activity, any channel, any category, etc. Can be as granular or high level as you want (e.g. configuring weights for specific emojis in a specific channel, or configuring weights for all activity on GitHub vs discord vs forum). Definitely not a blanket solution!
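
To illustrate what "as granular or high level as you want" could look like, here is a small sketch of weight overrides that fall back from specific to general. The keys, values, and lookup order are assumptions made up for this example and are not SourceCred's real weights file format.

```python
# Hypothetical weight overrides keyed by (platform, scope, activity).
# "*" is a wildcard. This is NOT SourceCred's real config format.
WEIGHTS = {
    ("*", "*", "*"): 1.0,
    ("github", "*", "pull_request"): 4.0,
    ("github", "wiki", "pull_request"): 2.0,
    ("discord", "memes", "reaction"): 0.5,
}


def lookup_weight(platform, scope, activity, weights=WEIGHTS):
    """Return the most specific matching weight, falling back to wildcards."""
    candidates = [
        (platform, scope, activity),  # e.g. one repo or one channel
        (platform, "*", activity),    # one activity type across a platform
        (platform, "*", "*"),         # everything on a platform
        ("*", "*", "*"),              # global default
    ]
    for key in candidates:
        if key in weights:
            return weights[key]
    raise KeyError("no default weight configured")


print(lookup_weight("github", "wiki", "pull_request"))        # 2.0
print(lookup_weight("github", "interspace", "pull_request"))  # 4.0
print(lookup_weight("discourse", "general", "post"))          # 1.0
```

With a layout like this, valuing wiki pull requests below Interspace ones is a one-line override rather than a change to the whole system.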

In addition, with Boosting, you don’t actually have to configure the weights or reach consensus with everyone for what the weights should be. If you think something is being undervalued, you can boost it yourself, and if you were right then you will earn back your XP and possibly more depending on how undervalued it was.

e.g. If you think that Interspace is undervalued vs the wiki and others disagree, you can just boost Interspace yourself, then start pushing people to make more contributions on Interspace (the fact that they earn more XP helps, but you can also go talk to people and tell them what to do, etc). You will then earn a portion of their XP if you get them to do stuff, without actually having to do any work on Interspace yourself! This is super powerful since it creates incentives for people to directly benefit from “project management” work and from helping to guide the community in the right direction, with skin in the game, and without needing everyone’s approval to increase the reward for something. This allows us to scale past Dunbar’s number since we don’t have to coordinate with each other for these decisions, and instead incentivizes people to step up to the plate and make sure things keep moving.

Answers to question 7 & 8 coming soon, need a break now lol


I wanted to also follow up with software design:

Revision and change are not signs of bad or faulty code. And if code never changes, it does not get better like wine; it ages like COBOL. The fewer people interact with, use, or change code, the further it falls behind current practices and processes. Core pieces will not change as much, but the value of code is not based on update frequency, which you touched on.

I think one of the main advantages is the credit flow and our role as approvers. When doing code review, if someone does that (say, opens 10 pull requests where one would do), reject 9 of them and add a comment asking for the full code as a single PR.

One thing I really liked about SourceCred is the level of configurability and how easy it is to change the weights. I don’t know exactly where, but you should be able to edit them through the weight configuration button.

I think another big idea is allowing people to flow cred to people that didn’t receive the credit when they should have.


I also would like to address the credit recalculation and black box nature…

The ability to recalculate or completely change this stuff on the fly requires a ton of trust, especially at the time of XP to SEED minting. Until we establish our code of conduct and coding practices, this graph and these weights are fluid. At some point we need to lock the rules to trustlessly assure we are on the same playing field.

I think the XP to batch Seed minting is a genius idea and takes a lot of the leg work out of it and lets us visualize the next SEED cycle because you can see the graph leading up to the end of the month or 2 weeks.

What I can tell you is SourceCred does a ton of stuff correctly and is well maintained. It may not be 100% perfect for us out of the box, but luckily it is open source and well maintained! @decentralion has been pretty awesome in my interactions with them.

A bot that can watch GitHub, Discord, and Discourse and make pretty graphs sounds like a huge win to me, with almost no drawback or cost. It empowers us and gives us tools, but I don’t think it ties us down. For the most part it works unnoticed in the background (which is also a big win imo).


So the weight configuration you see on the website doesn’t actually save the changes or modify the underlying weights set in the config file on github, it just lets you play around with different parameters and see how the distributions change. That way you can see the effect that different weights will have and tune them until you are happy, then go update the configuration in GitHub with your new changes.

If you want to actually change the weights, then you have to make a pull request and get it approved. No one can change the weights silently; it’s all in the git history.

The project configuration is done in this file: https://github.com/MetaFam/TheSource/blob/master/project.json

The weights are configured in this file:

You can see the entire history of the weights there and any changes that were made. Note that these are just overrides for the default weights in SourceCred, and with the new version of SourceCred we will be updating these as well to keep them in sync.

All the calculations are run on GitHub actions using this file: https://github.com/MetaFam/TheSource/blob/master/.github/workflows/generate-cred.yml

And you can see the entire history of every time the calculation was run by going to the actions tab: https://github.com/MetaFam/TheSource/actions

Currently we run it every 6 hours.


Great thank you for the clarification @METADREAMER <3 That makes a lot of sense and makes me feel more secure already

@peth, you raise some valid concerns. @decentralion, @METADREAMER and @c0mput3rxz have covered most of the ground I would have, but wanted to chime in with a couple thoughts.

While MetaGame is certainly investing into its SourceCred integration, I would argue that the overall MetaGame architecture is still quite modular. I may be mistaken, but MetaGame appears to be implementing a flavor of AraCred (Aragon + SourceCred). In AraCred’s recently launched docs, the section on tokenomics specifically calls out three composable components:

  • Off-chain contributor scores: SourceCred
  • Bridge: the + of AraCred
  • On-chain tokens: the world of Ethereum

From the AraCred vision section:

“While AraCred = Aragon + Sourcecred, the most important part is not Aragon or SourceCred: it’s the +. It’s the bridge that combines them to create something greater than their parts. We intend to build this bridge so that you can swap out the contribution tracking algorithm from SourceCred to any other mechanism. You can also swap out any type of DAO design that you want.”

Tools that allow collaborative editing are much better for some things, for sure. SourceCred has used Google Docs a few times. This does sometimes create more work, as you need to put that doc into the graph somehow, using the Initiatives plugin or another plugin. So far, this hasn’t been an issue for us. We’ve just included any Google Docs as a part of a larger Initiative that flowed cred appropriately. For instance, we collaborated with MakerDAO on a development grant in Google Docs (tentative green light btw, which should create learnings for MetaGame:). This was credited in the graph in our first batch of Initiatives.

While I think this method is working well for SourceCred so far, it is definitely an experimental work in progress, and not necessary for MetaGame. Setting any high-level weights is basically a decentralized governance problem. Many ways to skin that cat.


Appreciate the healthy skepticism @david. As a SourceCred contributor, I’m biased, but will say that I became excited about SourceCred when I ran it on repos in another blockchain project I work in, and the scores better reflected reality than the political process of distributing rewards.

There are some big, unsolved issues being tackled here. But I would argue that any DAO has these issues. At the end of the day, if a project is paying for labor, it does need to put numbers on things either way. If you don’t have a reputation system, or some form of metrics, you end up with the tyranny of structurelessness (see hidden hierarchies, politics and inequality). If you want permissionlessness and pseudonymity (necessary for any really interesting game), there are some serious governance issues that no DAO has really solved. In researching an article I wrote on the problem, I realized it’s pretty grim actually. I think SourceCred has a good shot at solving these problems, but has not yet been tested at scale. What makes me hopeful is that it’s been working well for the SourceCred community so far (with real money on the line), and we’re getting good feedback from other projects.

I would suggest a third possible option:

c) this is big, but proven only in certain smaller contexts. Scaling past Dunbar’s number will require more work, failures, and iteration. It will be messy, and success is not guaranteed.

We are acutely aware of the dystopian possibilities. However, any system that has the power to meaningfully change the world for the better, always has the possibility to change it for the worse. In the podcast below, @decentralion gives a good high-level introduction to the philosophy behind SourceCred. They go into the Black Mirror episode with the dystopian reputation system actually, and talk about how SourceCred is designed to avoid the dystopian outcomes.

Totally agree. I think this is all in the spirit of the game :slight_smile:


Had not thought of this, but it’s a good point! I do think SourceCred sometimes inappropriately values PRs in GitHub. But something I’ve noticed on repos I’m knowledgeable about is that the scores tend to converge on reality over time. I think this is partly because of the dynamics of OSS development. Sure, someone could spam a maintainer with crap PRs, but eventually the maintainer stops responding to those PRs. Long-term interactions tend to signal real value. It will be interesting to see if the financial incentives change this dynamic, but my guess is not materially.


Hey @s_ben! Thank you for a very informative response… I will get to studying the included links in more detail over the next days (or weeks). Reputation systems are indeed hard, but you helped remind me that experiments in this field are very necessary, and that SourceCred is one viable contender which may offer something substantial in the future. It cannot be known yet, and that’s the point of experimenting. For now I wish you all a lot of great ideas, working implementations and a lot of luck with the process of concluding what works and what doesn’t.


Thanks for the input @david! Without people willing to speak their minds about the issues they see, we won’t get anywhere. Always happy to hear critiques!

I can’t think of an example of software that would be written once and never touched again, but would continue to deliver long term value without being directly referenced or used in the future. - @METADREAMER

Doesn’t have to be code, it can be a wiki entry.
How does it pick references across platforms?
Eg. if someone wrote something on github, but you’re referencing it in a forum post.

Also, since wiki entries will mostly just be formatting of the things that were already written and rewarded through the forums, maybe they should be worth less?

Now that we have seen that happen, it has come to light that maybe it’s not the best metric to have. But that’s the beauty of SourceCred! We can decide to remove it or change how it works to make it more accurate for our use case

Maybe it would make sense to keep it, but make it not count as XP if the OP comment was reacted to with a thumbs down?

While MetaGame is certainly investing into its SourceCred integration, I would argue that the overall MetaGame architecture is still quite modular. - @s_ben

Yes & no. We can’t go back to the old accounting system because nobody was tracking their contributions over the past 2 months while we were relying on sourcecred, and with any other new system it would again be a nightmare to calculate the old contributions into new weights. We also still have to deal with recalculating contributions from before sourcecred into sourcecred too :joy:

Anyway, it doesn’t seem like we’ll need to detach either way :slight_smile:


All the links and references work across platforms, each “plugin” has a way to define “resolvers” for any links / URLs

Also, since wiki entries will mostly just be formatting of the things that were already written and rewarded through the forums, maybe they should be worth less?

How do we know they were rewarded appropriately? If only one or two people liked it, that’s not a lot of reward for the work done. Also, “worth less” than what? It’s all relative. We first need to see how things are working out, and if we notice something is wrong we can adjust after the fact. Just let it happen naturally instead of trying to predict and control how people will be earning XP, because that all depends on the behaviour of people, not something we do beforehand.

Formatting and cleaning up the content, figuring out what section to put it in, and deciding how it should be structured is a lot of work in and of itself. I don’t think it should be understated, because right now we have a shit ton of content in the forums but not much in the wiki, so people need to be incentivized to actually put it there. We start with the default settings, and then adjust if it’s too unfair.

Maybe it would make sense to keep it, but make it not count as XP if the OP comment was reacted to with a thumbs down?

It only counts XP for certain emoji reactions that are “whitelisted” as positive reactions. Currently it’s these ones: image

We can add more or adjust the weights if we want certain ones to have more significance, this was just a baseline to start from.
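
A minimal sketch of how a whitelist like this could work, assuming invented emoji names and weights (the actual whitelist is not reproduced here, and this is not the Discord plugin's real code):

```python
# Hypothetical whitelist with per-emoji weights; the names and values are
# placeholders, not the actual list used by the plugin.
POSITIVE_REACTIONS = {"thumbsup": 1.0, "heart": 1.0, "rocket": 2.0}


def reaction_xp(reactions, whitelist=POSITIVE_REACTIONS):
    """Sum weights for whitelisted reactions; anything else mints nothing."""
    return sum(whitelist.get(name, 0.0) for name in reactions)


print(reaction_xp(["thumbsup", "thumbsdown", "rocket", "joy"]))  # 3.0
```

Reactions that are not on the list, such as a thumbs down, simply contribute nothing, and giving one emoji more significance is just a matter of raising its weight.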