Open Registry of Game Information 

  • Contribution process

  • Talk about specific features of our upcoming online game database.

Moderators: MZ per X, gene

 #37648  by Tracy Poff
 28 Nov 2013, 00:58
MZ per X wrote:Here, I disagree.

Static content delivery is cheap, and easily scaled, so I don't see why we should make Oregami dependent on other online resources, as durable as they might be. Especially when it comes to sources, which are a very important part of our goals of transparency and scientific approach, but not very often downloaded at the same time, there's no good reason to send people somewhere else, I think.
Of course my reasoning isn't that we're going to fill our server up with too many images or whatever. Rather, screenshots of a web page aren't as useful as a (durable) link to the actual webpage, and saving a webpage isn't as simple as just saving the HTML or whatever. People like the Internet Archive have great expertise at archiving web pages, and they're likely to do a better job of it than we are.
MZ per X wrote:While I can see the value in your suggestions, avoiding redundant data being the most important, I'd still vote for the source data being independent from any other data.
On the contrary, I'd argue that much more important than avoiding duplication is making the data more readily available. If a screenshot of credits is valuable as a source, then it's valuable, period, and there's no reason we shouldn't include such a screenshot with others. If a screenshot of the options screen of a game confirms that it supports 16:9 and 4:3 (for 'tech specs' or however we might store that info), I also think that people might just want to see the options screen for themselves--I know it's often the first place I look when starting a game, and there are rarely screenshots available.

Box scans, screenshots, manual scans, whatever. If they were useful as a source for our data, someone might find them useful in and of themselves, and I think we'd do a service by ensuring that they're available wherever someone might expect to find them, rather than 'hidden away', relative to more foregrounded data.

All that said...
MZ per X wrote:This independence will save us from quite some problems, the main problem being turning off people like Jotaro Raido who are only interested in a certain part of the data (credits in his case). The quality of source material for credits contributions doesn't need to be high; only the text needs to be readable. I once even used a camera to shoot Nintendo DS credits, and people often paste dozens of credits screenshots together into a single one to make them easier to handle. I see the danger that these contributors are then told to provide standard quality scans and shots for introduction into the covers or screenshots section of Oregami, then use the source link facility to save space, which would be counterproductive.
This is an important point. All the good intentions in the world won't save us if they make the process difficult for contributors. I don't let the practical concerns about contributing go far from my mind, I assure you. My thoughts are something like this: if all we have are low-quality screenshots to verify some data, that's unfortunate, but certainly no reason to reject the data. Later, though, someone might take better screenshots, at which point these better screenshots could replace the lower-quality ones as a source to verify the data. And, if those screenshots were of sufficient quality, we might as well store them right along with all other screenshots.
MZ per X wrote:Furthermore, it could happen that covers or screenshots will need to be deleted for whatever reason, leaving us with unsourced data scattered throughout the database. In summary, I don't think that the added complexity of the data model for source links is worth its drawbacks.
That would be a big problem. This is why I suggested that the source information should include some description like "These credits are taken from the staff roll displayed upon completing the game." in addition to the screenshots or video or whatever, so that while the source may be less readily available, we will at least know what it is.

I think this is mostly an operational issue, anyway: we should be careful when deleting screenshots of credits, since they may be used as a source, but that's one of the few times when we'd need to worry about deleting screenshots. Or, maybe we should just never delete screenshots, and instead unlink them from a game and hide them, but leave them available. That's easy enough. Or if screenshots are database objects, we can refuse (programmatically) to delete them while they're linked as sources. But this is all implementation details, and there's no point discussing such things at this early stage, so I'll stop with that.
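
As a rough illustration of the two options mentioned above (hiding instead of deleting, and refusing to delete while a screenshot is linked as a source), here is a minimal Python sketch. Everything in it (`Screenshot`, `ScreenshotStore`, `cited_by`, `StillLinkedError`) is a hypothetical name chosen for the example, not part of any Oregami data model.

```python
from dataclasses import dataclass, field


class StillLinkedError(Exception):
    """Raised when someone tries to delete a screenshot that is still cited."""


@dataclass
class Screenshot:
    shot_id: int
    hidden: bool = False  # soft delete: removed from galleries, file kept
    cited_by: set = field(default_factory=set)  # ids of sources citing this shot


class ScreenshotStore:
    def __init__(self):
        self._shots = {}

    def __contains__(self, shot_id):
        return shot_id in self._shots

    def add(self, shot):
        self._shots[shot.shot_id] = shot

    def get(self, shot_id):
        return self._shots[shot_id]

    def hide(self, shot_id):
        # Unlink from public view but keep the record available as a source.
        self._shots[shot_id].hidden = True

    def delete(self, shot_id):
        shot = self._shots[shot_id]
        if shot.cited_by:
            # Refuse programmatically while the shot is linked as a source.
            raise StillLinkedError(
                f"screenshot {shot_id} is cited by {sorted(shot.cited_by)}"
            )
        del self._shots[shot_id]
```

With this shape, a moderator who wants a credits screenshot out of the gallery would call `hide`, and `delete` only succeeds once nothing cites the image any more.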

There's just one more related thing that I want to say, since I think I explained myself poorly a few posts ago:

I envision that the source information that we display should be a 'first class' data field, just like a game's description. It shouldn't be merely the collection of all the notes people stuck on the changes they submitted. For this reason, it doesn't worry me too much if some particular video used as a source should be taken down, or whatever, since, as long as the source is described and not merely linked to, we can always replace the link, and someday we can maybe link to our own screenshots showing that information. The source information is as much a 'living' part of the database as anything else, so I believe we can always improve things in the future. So please take my 'screenshots should link to our own screenshots section' sort of comments as 'ideally, in the future, when we're able.'
 #37650  by jotaroraido
 28 Nov 2013, 02:29
Tracy Poff wrote:This is an important point. All the good intentions in the world won't save us if they make the process difficult for contributors. I don't let the practical concerns about contributing go far from my mind, I assure you. My thoughts are something like this: if all we have are low-quality screenshots to verify some data, that's unfortunate, but certainly no reason to reject the data. Later, though, someone might take better screenshots, at which point these better screenshots could replace the lower-quality ones as a source to verify the data. And, if those screenshots were of sufficient quality, we might as well store them right along with all other screenshots.
The distinction I'm trying to make here is between "good enough to verify data" and "good enough to display as representative of the game". Just a quick flip through my un-archived stuff turns up software watermarks, user video overlays, blurry shots, washed-out shots, crappy digicam shots, highly-artifacted shots, and emulation glitches. Notwithstanding that they're from the endings, not one of these images would meet the quality standards for inclusion in MobyGames, and I think we should have comparably high standards for display images except under extreme circumstances. I personally wouldn't want to see screenshots that looked like these if I was interested in learning more about a game. They should be easily accessible, yes, and replaceable with better-quality images should they become available, but I don't personally believe they're fit to be displayed in the main gallery.
Tracy Poff wrote:For this reason, it doesn't worry me too much if some particular video used as a source should be taken down, or whatever, since, as long as the source is described and not merely linked to, we can always replace the link, and someday we can maybe link to our own screenshots showing that information.
On the contrary, I believe that anything we're able to reasonably archive ourselves, we *should* archive ourselves. Relying on YouTube to not take down videos is a dicey proposition at best. On MG, I've seen credits videos get removed while the submission linking to them was still pending. And of course, the submitter didn't include any screenshots, so they had to be rejected as we couldn't verify them...
 #37651  by Tracy Poff
 28 Nov 2013, 03:11
jotaroraido wrote:The distinction I'm trying to make here is between "good enough to verify data" and "good enough to display as representative of the game". Just a quick flip through my un-archived stuff turns up software watermarks, user video overlays, blurry shots, washed-out shots, crappy digicam shots, highly-artifacted shots, and emulation glitches. Notwithstanding that they're from the endings, not one of these images would meet the quality standards for inclusion in MobyGames, and I think we should have comparably high standards for display images except under extreme circumstances. I personally wouldn't want to see screenshots that looked like these if I was interested in learning more about a game. They should be easily accessible, yes, and replaceable with better-quality images should they become available, but I don't personally believe they're fit to be displayed in the main gallery.
Eh, I think we're talking past each other. This was just my point: no reason we should reject bad quality images as sources, when that's all we have, but it should be our goal to eventually replace these with higher-quality ones--of sufficient quality that we're willing to display them right along with the other screenshots. Ideally, we'd have sufficiently high-quality images of everything that we could link to our own screenshots (or box scans, or whatever) as verification of everything.
jotaroraido wrote:On the contrary, I believe that anything we're able to reasonably archive ourselves, we *should* archive ourselves. Relying on YouTube to not take down videos is a dicey proposition at best. On MG, I've seen credits videos get removed while the submission linking to them was still pending. And of course, the submitter didn't include any screenshots, so they had to be rejected as we couldn't verify them...
A tragic tale, indeed! I'm not, in principle, against saving those things ourselves, and in the case of a screenshot from a game that's certainly fine. But for something like a web page, a screenshot of it is inferior as a source, compared to an archived copy from the Wayback Machine, so while I'm sympathetic to the goal of ensuring that we have a copy in case 'the worst' happens and the Internet Archive goes down, or something, preferring screenshots to IA links honestly just seems like degrading our service in deference to paranoia, to me.
 #37670  by MZ per X
 01 Dec 2013, 21:58
Tracy Poff wrote:Rather, screenshots of a web page aren't as useful as a (durable) link to the actual webpage, and saving a webpage isn't as simple as just saving the HTML or whatever. People like the Internet Archive have great expertise at archiving web pages, and they're likely to do a better job of it than we are.
This could be solved as easily as saving screenshots and the URL within the source data set. But screenshots are still mandatory for me, as future-proofness is a founding principle of Oregami. While you are right in that the Wayback Machine may stay forever, I'd still like Oregami to be as self-contained as possible. This also means that stills of source videos would need to be taken.
Tracy Poff wrote:Box scans, screenshots, manual scans, whatever. If they were useful as a source for our data, someone might find them useful in and of themselves, and I think we'd do a service by ensuring that they're available wherever someone might expect to find them, rather than 'hidden away', relative to more foregrounded data.
Okay, what about including a "source type" in the source data set? If this type is "screenshot", then we could show it in the screenshots section under an additional view called "Other shots" or something like this.
Tracy Poff wrote:That would be a big problem. This is why I suggested that the source information should include some description like "These credits are taken from the staff roll displayed upon completing the game." in addition to the screenshots or video or whatever, so that while the source may be less readily available, we will at least know what it is.
Another important point in favor of having separate source data is that it may be much harder, if not impossible, to force us to take it down for copyright reasons. While normal screenshots and box scans are "just there", we really need those source shots for our scientific purpose, which is a strong justification for keeping them and would render our sourcing quite immortal. :)

Of course, you could turn that argument around: if we'd link normal screenshots as source, it would also be harder to force us to take those down. Yes, but not as hard as source-only shots, I guess. And combined with that "source type" thing above, we could have the best of both worlds: a barely mortal, and truly independent, sourcing where the special interest contributors would be able to contribute without caring too much about quality (source-and-forget, if you will), and the source materials un-hidden from the public.

Having said all this, we will need a free text field for source data, nonetheless.
Tracy Poff wrote:I envision that the source information that we display should be a 'first class' data field, just like a game's description. It shouldn't be merely the collection of all the notes people stuck on the changes they submitted.
From my notes above, I think you see that we agree on this one. :) A source data set could then contain:

1) File(s) (shots, scans, text files, whatever)
2) URL(s)
3) Source type
4) Description
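
The four-part source data set listed above could be sketched roughly as follows in Python. This is only an illustration under assumptions: the class and field names (`SourceRecord`, `SourceType`, `other_shots`) and the particular set of source types are invented for the example, and the "Other shots" filter simply mirrors the idea of showing screenshot-type sources in an extra gallery view.

```python
from dataclasses import dataclass, field
from enum import Enum


class SourceType(Enum):
    # Hypothetical set of types; the real list would be decided later.
    SCREENSHOT = "screenshot"
    SCAN = "scan"
    VIDEO = "video"
    WEB_PAGE = "web page"
    OTHER = "other"


@dataclass
class SourceRecord:
    files: list = field(default_factory=list)  # 1) shots, scans, text files
    urls: list = field(default_factory=list)   # 2) where the material came from
    source_type: SourceType = SourceType.OTHER  # 3) source type
    description: str = ""                       # 4) free-text description


def other_shots(sources):
    """Collect files from screenshot-type sources for an "Other shots" view."""
    return [
        name
        for source in sources
        if source.source_type is SourceType.SCREENSHOT
        for name in source.files
    ]
```

A contributor could then attach a quick camera shot as a `SCREENSHOT` source with a short description such as "staff roll", and the gallery code would pick it up without any extra quality review step.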
Tracy Poff wrote:For this reason, it doesn't worry me too much if some particular video used as a source should be taken down, or whatever, since, as long as the source is described and not merely linked to, we can always replace the link, and someday we can maybe link to our own screenshots showing that information. The source information is as much a 'living' part of the database as anything else, so I believe we can always improve things in the future. So please take my 'screenshots should link to our own screenshots section' sort of comments as 'ideally, in the future, when we're able.'
This I don't really understand. I envisioned that every iteration of certain data has a complete source attached to it, one at first contribution, then an additional one with every change. Is that what you describe with "living part of the database"?
 #37673  by Tracy Poff
 01 Dec 2013, 23:33
MZ per X wrote:This could be solved as easily as saving screenshots and the URL within the source data set. But screenshots are still mandatory for me, as future-proofness is a founding principle of Oregami. While you are right in that the Wayback Machine may stay forever, I'd still like Oregami to be as self-contained as possible. This also means that stills of source videos would need to be taken.
I suppose it'll work out. I still worry about us taking on too many responsibilities that are outside just 'documenting games', since that's a big enough task all on its own, but perhaps it will all be fine.
MZ per X wrote:Okay, what about including a "source type" in the source data set? If this type is "screenshot", then we could show it in the screenshots section under an additional view called "Other shots" or something like this.
We haven't (as far as I know) got any plan for handling media at all yet, so I think it's maybe a little early to go into too much detail. Since our handling of all images, sources or not, is going to rely on that, let's table the discussion of these details, for the moment.
MZ per X wrote:This I don't really understand. I envisioned that every iteration of certain data has a complete source attached to it, one at first contribution, then an additional one with every change. Is that what you describe with "living part of the database"?
Let me give a (simplified, inaccurate, not following our data model) example to show just what I mean:
Code:
Revision 1 (new game being added)
Title: Super Oregami Bros.
Publisher: AwesomeSoft
Source: "Publisher listed on the box (scan-of-box-included-here)."

Revision 2
Title: Super Oregami Bros.
Publisher: AwesomeSoft
Credits: Fred Mobgo (developer), Joe Bloggs (art)
Source: "Publisher listed on the box (scan-of-box-included-here). Credits from the manual (bad-quality-scan-included-here)."

Revision 3
Title: Super Oregami Bros.
Publisher: AwesomeSoft
Credits: Fred Mbogo (developer), Joe Bloggs (art)
Source: "Publisher listed on the box (scan-of-box-included-here). Credits from the manual (bad-quality-scan-included-here)."

Revision 4
Title: Super Oregami Bros.
Publisher: AwesomeSoft
Credits: Fred Mbogo (developer), Joe Bloggs (art)
Source: "Publisher listed on the box (scan-of-box-included-here). Credits from the manual (better-quality-scan-included-here)."
In the first revision, the game was added, and a scan of the box was included as a source for the publisher.

In the second revision, credits were added, and a scan of the manual was provided as a source for them. The source for the publisher is still listed, and the submitter added the source for the credits to the pre-existing source field.

In the third revision, a typo in the credits was fixed. No change to the source field was needed.

In the fourth revision, a better-quality scan of the manual replaced the old, bad scan. No changes were made to the other data.

Of course, in the real database, the credits are likely to be a separate object with their own source field, separate from the game entry, and titles are done differently, too, but my point is that the data and the source of the data may be modified independently, and that each revision contains the full source information for everything contained in it, including things added previously. Naturally, we'll expect that people verify that a 'better scan' is indeed a better scan of the same document, or at least that it also serves as a source for the same information, before we accept a revision that changes an old scan to a new one, but that sort of thing is just a matter of policy.
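
The same four revisions can be expressed as a small Python sketch, to make the "full snapshot per revision, data and source editable independently" idea concrete. Like the example above, it is simplified and does not follow any real data model; `Revision` and `revise` are hypothetical names, and the scan references are shortened to plain text.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Revision:
    number: int
    title: str
    publisher: str
    credits: tuple = ()
    source: str = ""


def revise(prev, **changes):
    """Return the next revision: a full copy of prev with only `changes` applied."""
    return replace(prev, number=prev.number + 1, **changes)


BOX = "Publisher listed on the box (box scan)."

# Revision 1: new game, publisher sourced from the box.
r1 = Revision(1, "Super Oregami Bros.", "AwesomeSoft", source=BOX)

# Revision 2: credits added, source field extended (typo still present).
r2 = revise(
    r1,
    credits=("Fred Mobgo (developer)", "Joe Bloggs (art)"),
    source=BOX + " Credits from the manual (bad-quality scan).",
)

# Revision 3: typo fixed, source untouched.
r3 = revise(r2, credits=("Fred Mbogo (developer)", "Joe Bloggs (art)"))

# Revision 4: better scan, data untouched.
r4 = revise(r3, source=BOX + " Credits from the manual (better-quality scan).")
```

Because every revision is a complete, immutable snapshot, any one of them can be shown on its own with the sources for everything it contains, including data that was added in an earlier revision.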
 #37680  by idrougge
 02 Dec 2013, 17:46
Tracy Poff wrote:
Ultyzarus wrote:Publicly viewable, with saved screenshots of those sources.
Rather than archiving screenshots ourselves, I'd say we'd be better off working to have durable links to sources. In the case of web sites, a link to a copy of the site in the Wayback Machine is likely to be at least as durable as Oregami.
Actually, no.
First of all, it is not certain that Archive.org will archive the page in question at the date of submission. Pages will change, and are only sampled periodically, if at all.
Second of all, I have seen pages disappear from Archive.org after introduction of a robots.txt, which Archive.org respects even retroactively.
Thirdly, Archive.org throws away pictures and embedded sources and has difficulty handling JavaScript and CGI scripts. It will make reading those pages a puzzle.

Not to mention that it is never safe to rely on an outside actor with problems of their own.
Last edited by idrougge on 02 Dec 2013, 18:10, edited 1 time in total.
 #37681  by idrougge
 02 Dec 2013, 17:59
Tracy Poff wrote: My guiding principle was that we should leverage the tools we're already building wherever possible. If screenshots of the game are useful as verification of details about the game, I don't think that it's unreasonable to store them right alongside the other screenshots. Perhaps someone will find them useful who wouldn't have encountered them if they'd been only in the source information section for the credits, or whatever. Similarly, if we're storing box scans anyway, and if a box scan is the source of some data, we might as well simply link to our copy of the box scan, rather than storing another copy.
People who find captures of credits listings useful will find the source images link. People who want to look at screenshots of Bloody Gun VIII will find them absolutely useless. There's a reason Wikipedia has an article view, a discussion view and a history view.
Tracy Poff wrote:An anecdote: for some open source game (Neverball, perhaps?), I contributed information about its release dates, or patches, or something (don't have a copy of my own submission comments, so please forgive my vagueness) which I got by looking through its SVN history to see when the different releases were tagged. It's a pretty straightforward process, but I think that it probably wants a little bit more explanation than just a screenshot of a console window with 'svn log' run in it.

Similarly, I think that your example of using a hex editor to find credits will probably deserve more than just a screenshot of a hex editor, too.
There must be a balance between the effort required by the contributor and the effort required by the researcher using that info. Sooner or later, the effort of running a hex dump is overshadowed by the effort of describing the game ROM checksum, the offsets, the bit rotation and the interpretation used.
Tracy Poff wrote:So, however we deal with sources, we should certainly have at least the ability to write some text about them. In that case, I don't think it would be wrong to write something like: "These credits are taken from the staff roll displayed upon completing the game. A (low-resolution) video of the staff roll is available here."
Do you know how much time and energy it takes to write so many characters? Of course, a free-form text field is a requirement carried over from MobyGames, but writing "This is a red rose in a vase" if you've already attached a screenshot of a red rose in a vase is the perfect way to scare away both newcomers and experienced users. Writing "staff roll, Croatian version" in the description and the source URL in the source field in addition to screenshots which are obviously from a staff roll video is more than enough.
 #37687  by Tracy Poff
 03 Dec 2013, 04:12
idrougge wrote:First of all, it is not certain that Archive.org will archive the page in question at the date of submission. Pages will change, and are only sampled periodically, if at all.
Second of all, I have seen pages disappear from Archive.org after introduction of a robots.txt, which Archive.org respects even retroactively.
Thirdly, Archive.org throws away pictures and embedded sources and has difficulty handling JavaScript and CGI scripts. It will make reading those pages a puzzle.
Of course someone submitting an Archive.org link for a source should verify that it does display what they want it to. I think that goes without saying. The robots.txt problem is a bigger issue, I agree, and not just for us.
idrougge wrote:People who find captures of credits listings useful will find the source images link. People who want to look at screenshots of Bloody Gun VIII will find them absolutely useless. There's a reason Wikipedia has an article view, a discussion view and a history view.
Well, yes, not all screenshots of credits for all games will be of interest to everyone. But, for example, in Super Smash Bros. you play a bit of a shooting game during the credits, and it's not the only game that does something interesting during that time, so we might well want to have screenshots of some credits anyway.

Naturally, you can respond to this by saying "then we should only have screenshots of those credits that are of particular interest." But I think that's not the best approach. If possible, I'd really like us to have a very complete documentation of a game in screenshots: show off the title screen, all the major features, each level, bosses, and, yes, the credits. It might be a little excessive if we just dumped them all on a page, but with a bit of metadata, I think that it could be made very manageable.
idrougge wrote:There must be a balance between the effort required by the contributor and the effort required by the researcher using that info. Sooner or later, the effort of running a hex dump is overshadowed by the effort of describing the game ROM checksum, the offsets, the bit rotation and the interpretation used.
Sure. It's up to us all to decide where we feel that balance should be. My argument is that we should definitely support people adding extra details, if they feel they're needed.
idrougge wrote:Do you know how much time and energy it takes to write so many characters? Of course, a free-form text field is a requirement carried over from MobyGames, but writing "This is a red rose in a vase" if you've already attached a screenshot of a red rose in a vase is the perfect way to scare away both newcomers and experienced users. Writing "staff roll, Croatian version" in the description and the source URL in the source field in addition to screenshots which are obviously from a staff roll video is more than enough.
Of course, it won't be necessary to write a poem in blank verse describing every screenshot you've used. We should certainly try not to make our requirements too onerous, while still striving for reliability. It shouldn't be required to write too much, but, you know, my full-sentence-punctuation-and-all version was about 140 characters--only as long as a message on Twitter. Are you suggesting that for a database that strives for academic credibility, tweets hold too much detail?

In most cases, something like "staff roll, Croatian version" or "manual scan" or something is probably enough. Sometimes we have to do a little more to get our information. I gave the example of looking through the SVN history for a project for info, and jotaroraido gave the example of using a hex editor to extract credits. I can't speak for anyone else, but when I've put some real effort into tracking down accurate information like that, I feel proud of my work, and I don't think I'd find it too difficult to write a few words describing what I'd done.
 #37714  by Ultyzarus
 06 Dec 2013, 18:22
I think that the contribution process should differ for digital releases, with the contributor choosing between a retail and a digital option. I suggest that a "Distributed by" field be mandatory for digital releases, and that this field contain a website address rather than a company.
e.g.
https://store.sonyentertainmentnetwork.com/#!/en-gb (marked as UK PlayStation Store)
https://store.sonyentertainmentnetwork.com/#!/en-na (marked as North-American PlayStation Store)
http://store.steampowered.com/ (actually this one would be a company, Valve)

each with corresponding Region of release (UK, North-America, Worldwide, etc.)

This would ensure some kind of consistency in the shown data, removing confusion when choosing the release country, or choosing whether to make a new R or not.
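
A minimal Python sketch of that rule, assuming a single release record with a retail/digital switch: the "Distributed by" entry is required only when the release is digital. The names (`Release`, `digital`, `distributed_by`) are made up for the example and the game title is a placeholder.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Release:
    title: str
    digital: bool                 # the retail/digital choice made by the contributor
    region: str                   # e.g. "UK", "North-America", "Worldwide"
    distributed_by: Optional[str] = None  # storefront address, digital releases only

    def __post_init__(self):
        # Enforce the mandatory field for digital releases at creation time.
        if self.digital and not self.distributed_by:
            raise ValueError("digital releases need a 'Distributed by' entry")


uk_release = Release(
    title="Example Game",
    digital=True,
    region="UK",
    distributed_by="https://store.sonyentertainmentnetwork.com/#!/en-gb",
)
boxed_release = Release(title="Example Game", digital=False, region="UK")
```

Whether the field should hold a URL, a company, or a reference to a curated list of storefronts is left open here; the sketch only shows where the validation would sit.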
 #37746  by MZ per X
 15 Dec 2013, 20:45
Ultyzarus wrote:This would ensure some kind of consistency in the shown data, removing confusion when choosing the release country, or choosing whether to make a new R or not.
Yeah, we need to be careful with these releases, as we also don't list every store where you can buy a boxed release. Not sure about criteria, though, maybe only the big players should be accepted.
 #37749  by Ultyzarus
 15 Dec 2013, 22:21
MZ per X wrote:
Ultyzarus wrote:This would ensure some kind of consistency in the shown data, removing confusion when choosing the release country, or choosing whether to make a new R or not.
Yeah, we need to be careful with these releases, as we also don't list every store where you can buy a boxed release. Not sure about criteria, though, maybe only the big players should be accepted.
No, that is a suggestion that would only apply to digital releases, since they have a limited number of distributors. Hence the suggestion of separating digital and retail releases in the contribution process (different options).
 #37885  by Ultyzarus
 23 Jan 2014, 21:54
I just thought that we should have an easy process to merge or split games (or RGs). Say that a user adds a new game because it's a new version of a game we already have, but with a different title. He might not know that the game existed before under a different name, and we'd end up with two entries.

It should be easy to select a game that it would be merged with (maybe with game ID numbers: "select game to be merged, then select game to merge to"; "Information from Game #65482 will be added to game #653339, are you sure?"), then select the description and merge it with the existing one, or move it to the RG description (whichever applies).

Similarly, the split tool would work as if we were creating a new game entry, except that the existing information would be used as a base for the entry.
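
To make the merge and split idea a bit more concrete, here is a small Python sketch. It assumes a deliberately flat `Game` record with titles, a description and a list of releases; the names `Game`, `merge_games` and `split_game` are hypothetical, and a real tool would also need the confirmation step and the choice between game and RG description mentioned above.

```python
from dataclasses import dataclass, field


@dataclass
class Game:
    game_id: int
    titles: list
    description: str = ""
    releases: list = field(default_factory=list)


def merge_games(source, target):
    """Fold `source` into `target`; `target` keeps its id."""
    for title in source.titles:
        if title not in target.titles:
            target.titles.append(title)
    target.releases.extend(source.releases)
    if source.description and source.description != target.description:
        # Keep both texts so an editor can tidy the result afterwards.
        target.description = (
            target.description + "\n\n" + source.description
        ).strip()
    return target


def split_game(original, new_id):
    """Start a new entry that uses the existing information as a base."""
    return Game(
        game_id=new_id,
        titles=list(original.titles),
        description=original.description,
        releases=list(original.releases),
    )
```

After `merge_games(duplicate, existing)` the duplicate entry would be retired by the caller, while `split_game` hands back an independent copy that the contributor then trims down to what belongs to the new entry.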
 #37896  by MZ per X
 26 Jan 2014, 19:53
Ultyzarus wrote:I just thought that we should have an easy process to merge or split games (or RGs).
Oh, we definitely need this. Just to transfer the TheLegacy data to our data model, we're gonna need good tools. :)