Talk:Search odds condensed (results)

From The Shartak Wiki

Search odds results

I don't have opinions one way or the other about Tables versus Lists, but I do have strong opinions about Data: Please stop removing Search Data from the wiki. You appear to have deleted all of the search data without posting aggregate totals. It is helpful to have you collate the data, but only if you return the collated results to the wiki! You can post summarized lists, or post a table, or cut/paste from Excel, or upload an XML or Excel file, or even leave the raw data in its original list form -- As long as the numerical totals stay. The orange/brown pictures are great and give an overall sense of successes, but they don't provide sufficient data for ANOVA or Chi-Square or all the other exciting statistical tests we have planned. I apologize if I missed a Search Odds Results Archive page. Thanks for your consideration. --Tycho44 18:15, 27 April 2006 (BST)
Your Search odds condensed (data) page is exactly what I was hoping for. I apologize for my disproportionately rude remarks above. --Tycho44 00:56, 29 April 2006 (BST)

whoa.
  1. data / results
    1. my bad about removing the old data; if you wanted it you only had to look in the history. The wiki (as far as I know) keeps it for quite a while, and I commented at the points when the data was taken and collated. Regardless, I have created another page, Search odds condensed (data), for you. I really had no idea there was interest in it; instead of getting all bold-fonted and pissy about it you could have just asked.
    2. are there any other results data you want me to fix or make public? I generally feel like I am wasting my time if multiple people are doing the same thing, so ask as I may have it already.
  2. as for the more advanced statistics you and whoever else you imply with "we" are planning.. In my opinion, it is hardly possible to glean anything valuable from even the basic stats I have done with the limited amounts of data. I agree that those things should be done, it was my original plan, but to do it now with the number of degrees of freedom and under 100 data points is in my mind ludicrous. It seems you are volunteering to do it, so I leave it to you then and look forward to your results. I do really hope you can find something meaningful, that would be great.
  3. now, as to playing nicely in the sandbox.. maybe it is just me, but you seem to not understand wiki etiquette
    1. for your information, I am automatically reading in the data and collating it (kludged together with Excel mind you, there simply hasn't been enough interest to database it and develop it further). So for example, if you want to change the table, it affects things. Does it affect things in a big way? no. Nonetheless, imfho you should ask me about it, as a simple skim of the page or its history would indicate that I seem to be the one working on it. I would consider this the minimum of: common. fucking. courtesy.
    2. I have left everything I have posted open for discussion. I have asked for input on the table and the data analysis; neither you nor anyone else has written anything, nor asked for info, until today.. if you would like to continue with this project on your own or work out your own stats without my involvement, please feel free to take it over, but please let me know. I didn't set out to do this and will not do this as some kind of pissing contest. The nature of a wiki is that it is an open community effort, where we all can work openly together and we all win. asshat desensitised manners will get you .. well.. asshattedness. *shrug*
-- fitzcarraldo|T 00:09, 28 April 2006 (BST)
My intention was to contribute rather than to damage the work that you've done. I appreciate the work you've put into this, and I'll put future format change proposals here first to avoid breaking automated data reading. I have no interest in pissing contests; I'll see what I can do to add more search data. --Tycho44 00:56, 29 April 2006 (BST)
My meaning is NOT to push anyone away, but to open dialogue; as stated in Talk:Search odds condensed this project could use your and anyone else's help with adding data, and collation. I just don't see the point in multiple people doing the same things, particularly since my time (and I assume yours as well) is limited; working together we can probably take things further and make them better! :D -- fitzcarraldo|T 02:07, 1 May 2006 (BST)
In my opinion, the most valuable outcome would be for everyone to have a complete database of all raw data to date. With a full database, it is pretty quick for someone new to come along and hypothesize, "hey, mangos and bananas have exactly the same find probability in all locations", and run some sort of test. Others who trust that hypothesis could combine banana and mango finds in their data, giving them a narrower estimate of find rate for fruits. "are there any other results data you want me to fix or make public?" sure, if you have additional useful data, that'd be great. --Tycho44 05:35, 16 May 2006 (BST)
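The kind of check described above (do two items share a find rate, so that their finds can be pooled?) could be sketched as a chi-square test on a 2x2 table. This is only an illustration: the counts are made up, the function name `chi_square_2x2` is hypothetical, and it compares the two items at a single location rather than across all locations.

```python
import math

def chi_square_2x2(finds_a, searches_a, finds_b, searches_b):
    """Chi-square test (1 degree of freedom) that two items share one find rate.

    Assumes at least one find and at least one miss overall, otherwise the
    denominator below is zero.
    """
    a, b = finds_a, searches_a - finds_a   # item A: finds, misses
    c, d = finds_b, searches_b - finds_b   # item B: finds, misses
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, the upper tail of chi-square is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Placeholder counts, NOT real wiki data: 12 mangos and 15 bananas,
# each out of 100 searches.
chi2, p_value = chi_square_2x2(12, 100, 15, 100)
print(f"chi2 = {chi2:.3f}, p = {p_value:.3f}")
```

A large p-value only means the data do not contradict equal find rates; with small samples it says little either way.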

future

The more I think about this project with the expansion of the data, the more I feel queasy about finding the time for the combinatory and advanced analyses. I honestly did not expect the changes in the game to happen so quickly or so broadly. With that in mind I would like to pass on some of the responsibility and perhaps restrict the results while maybe employing more statistical work on the front end to glean the results people want. The goals for collation, which could certainly be misguided, were operating on the premise that most players really just want to know:

  1. where does one go to get the best chance to find a certain thing, and conversely
  2. what chance is there of finding something in a certain place.

Adding a need for users to cross-reference class and skills makes, I think, for a large and complicated table or tables. With that in mind, it will be difficult to meet the premise I set out with, and I am open to suggestions. I include a few ideas below. -- fitzcarraldo|T 02:07, 1 May 2006 (BST)

  1. databasing? the amount of data needed to cover the different eventualities and to gain accuracy while more players help out could increase exponentially the amount of data to be collated, which means the system needs to be ultimately quite scalable. The current system is using Excel, which has finite limits (250 rows for example) and has trouble with many combinations .. a true database I think would be great! which one? -- fitzcarraldo|T 02:07, 1 May 2006 (BST)
  2. designed experiments (DOE)? We would set up an experiment before taking data, then do a limited set of sampling as defined by the method based on the hypotheses. The results would indicate effects and potentially how significant they are, although this would not necessarily allow one to pinpoint percentages.. would it be valuable without actual numbers? If so, then we could concentrate the bulk of the data only on the basic statistics; which raises the next question.. -- fitzcarraldo|T 02:07, 1 May 2006 (BST)
  3. what are the basic stats that people really want to know? are there any corners that can be cut? -- fitzcarraldo|T 02:07, 1 May 2006 (BST)
  4. how many samples are required to yield significant results .. mind you, no one has reported having found a heavy sword at this point so it feels we have a long way to go, but I think there must be some strange mechanism that affects its probability of being found; nonetheless, my gut says 1000 samples would be a good number giving quite repeatable results. -- fitzcarraldo|T 02:07, 1 May 2006 (BST)
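As one possible take on the databasing idea in item 1, here is a minimal sketch using SQLite through Python's standard `sqlite3` module. Nothing here reflects the existing Excel setup: the table name `searches`, its columns, the helper functions and the sample rows are all assumptions made up for illustration. Each row is one search, and the two queries correspond to the two questions players want answered (where to look for an item, and what a given place yields).

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # a file path would be used for real data
conn.execute("""
    CREATE TABLE searches (
        location   TEXT NOT NULL,
        char_class TEXT,
        scavenging INTEGER NOT NULL DEFAULT 0,  -- 1 if the character has Scavenging
        item       TEXT                          -- NULL when nothing was found
    )
""")

# Placeholder rows, NOT real search data.
conn.executemany(
    "INSERT INTO searches VALUES (?, ?, ?, ?)",
    [
        ("Medical Hut", "Scientist", 0, "First Aid Kit"),
        ("Medical Hut", "Scientist", 0, None),
        ("Medical Hut", "Explorer", 1, "First Aid Kit"),
        ("Jungle", "Explorer", 0, None),
        ("Jungle", "Explorer", 0, "Driftwood"),
    ],
)

def best_locations(conn, item):
    """Question 1: where is the best chance to find a given item?"""
    return conn.execute(
        """
        SELECT location,
               SUM(CASE WHEN item = ? THEN 1 ELSE 0 END) AS finds,
               COUNT(*) AS total,
               SUM(CASE WHEN item = ? THEN 1 ELSE 0 END) * 1.0 / COUNT(*) AS rate
        FROM searches
        GROUP BY location
        ORDER BY rate DESC
        """,
        (item, item),
    ).fetchall()

def odds_in_location(conn, location):
    """Question 2: what is the chance of finding each item in a given place?"""
    return conn.execute(
        """
        SELECT item,
               COUNT(*) AS finds,
               COUNT(*) * 1.0 /
                   (SELECT COUNT(*) FROM searches WHERE location = ?) AS rate
        FROM searches
        WHERE location = ? AND item IS NOT NULL
        GROUP BY item
        ORDER BY rate DESC
        """,
        (location, location),
    ).fetchall()

print(best_locations(conn, "First Aid Kit"))
print(odds_in_location(conn, "Medical Hut"))
```

Keeping one row per search means class and Scavenging can be filtered or collapsed later with a `WHERE` clause instead of being baked into the table layout.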

Observations: where to get the best chance to find a certain thing?

(A) Generally speaking, the Medical Huts, Ammo Huts, and Shipwreck Treasure Hold are overwhelmingly better for searching than anywhere else in the game, assuming that you're trying to find medicine, weaponry, or coinage. Most generic locations (such as jungle) have a find rate of under 20% to obtain anything at all, and it is typically low-grade stuff. For a specific target (e.g. First Aid Kit), the appropriate resource hut (e.g. Medical Hut) is by far the best place to look (e.g. 77 FAKs in 416 AP, or about 18.5%, for characters without Scavenging). With 77 successes in 416 searches, we can be about 95% confident that the true FAK find rate lies between 14% and 22%.

  • There are way more searches taking place in these locations than anywhere else, so it should be much easier to collect reams of data on these specific locations.
  • There is much higher demand for accurate and complete search odds for these specific locations.
  • Specifically, these locations are: Outsider Ammo Huts, Outsider Medical Huts, Native Ammo Huts, Native Medical Huts, Pirate Armory (ammo), Pirate Large Cabin (medical), Pirate Hold (gems/coins).
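The 77-in-416 First Aid Kit figure in (A) can be checked with a normal-approximation confidence interval. This is a sketch of one standard method, not necessarily the one used for the range quoted above; the function name `normal_interval` is made up for illustration.

```python
import math

def normal_interval(successes, trials, z=1.96):
    """Normal-approximation (Wald) confidence interval for a find rate.

    z = 1.96 corresponds to 95% confidence.
    """
    p_hat = successes / trials
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / trials)
    return p_hat, p_hat - half_width, p_hat + half_width

p_hat, low, high = normal_interval(77, 416)
print(f"observed {p_hat:.1%}, 95% interval {low:.1%} to {high:.1%}")
# prints: observed 18.5%, 95% interval 14.8% to 22.2%
```

The approximation gets unreliable when finds are rare (a handful of successes, or none at all, as with the Heavy Sword), so it should not be trusted for those cases.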

(B) There is also a demand for Trees. Tree search results are completely independent from jungle searches (you find nothing but bananas). Since there is only one item to be found, statistical analysis is very straightforward, and a few hundred searches would be enough to satisfy my curiosity. In general, 385 searches give 95% confidence that the experimental probability is within +/- 0.05 of the true find probability, even in the worst case of a true find rate of 50%.
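The 385 figure comes from the usual sample-size formula for a proportion, n = z² · p(1 − p) / margin², evaluated at the worst case p = 0.5. A small sketch (the function name `required_searches` is just illustrative):

```python
import math

def required_searches(margin, p=0.5, z=1.96):
    """Searches needed so the 95% margin of error is at most `margin`.

    p = 0.5 is the worst case; a rarer (or more common) item needs fewer.
    """
    return math.ceil(z * z * p * (1 - p) / (margin * margin))

print(required_searches(0.05))  # prints 385
```

If the true find rate is known to be far from 50%, passing a rough guess for `p` gives a smaller requirement.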

(C) Searching open Jungle seems to me a lot like searching outside buildings in Urban Dead. Although you could potentially find most anything, it is a waste of AP compared to searching resource huts.

  • In my opinion, non-resource-hut searches should focus on potentially lucrative "distinctive" areas -- Ruins, Ruined Temples, Caves, and the like. We should develop a consistent abbreviation/coding scheme for locations - or even report the GPS coordinates of the search...
  • Currently, search outcomes in these distinctive areas are extremely disappointing. Perhaps that could even be reported as a Bug. What's the point of mysterious secret ruins in the middle of the jungle if there is nothing significant to be found there?

(D) In my opinion, the biggest problem with determining Search Odds is not the daunting number of independent variables (such as character class, jungle density, etc.), but the fact that many of these search odds are dynamically changing as Simon makes changes to the game code. Examples:

  • New locations (Tunnels, Trading Hut, etc.) keep getting added to the game.
  • It is nearly certain that the Heavy Sword find rate has been dramatically reduced (at least 4 Heavy Swords were found during the first few hundred player-months; currently there are several thousand players but no Heavy Sword found in months).
  • Chopping Jungle now seems to have a 25-35% chance of giving 1 XP instead of <10%... similar changes to find rate could be happening constantly behind-the-scenes.
  • With the addition of Scavenging, search odds for everyone else could easily have been tweaked downward.
  • Now that driftwood has a purpose, the probability of finding driftwood could have been tweaked up.

However, as with Urban Dead (where generators and fuel cans were added, then running generators changed search odds, etc.), changes to the search odds are not prohibitive to data analysis. Anyway, those are some of my thoughts... --Tycho44 20:43, 15 May 2006 (BST)

Also, in my opinion, the search targets should be 100, 200, 400, 800, 1200 searches in a particular terrain type. Information like character class (scientist vs explorer) or exact location (Derby vs Durham) should be logged whenever possible, but we can collapse across it until it becomes apparent that the search rates are distinct. It is already clear from <100 searches that Scavenging has a huge effect on some searches, so characters with Scavenging should be treated separately. If we're lucky, scavenging has a straight-up 50% increase in overall find rate or something simple. --Tycho44 20:51, 15 May 2006 (BST)
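To give a feel for what the proposed targets buy, the worst-case 95% margin of error (true find rate of 50%, normal approximation) can be computed for each one. This is a rough sketch; `worst_case_margin` is a made-up helper name.

```python
import math

def worst_case_margin(searches, z=1.96):
    """Largest 95% margin of error for a find rate estimated from `searches` trials."""
    return z * math.sqrt(0.25 / searches)

for n in (100, 200, 400, 800, 1200):
    print(f"{n:>5} searches: +/- {worst_case_margin(n):.3f}")
```

Each doubling of the search count shrinks the margin by a factor of about 1.4, so going from 100 to 400 searches halves it (roughly 0.10 down to 0.05).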