Author Topic: Request for crowdsourcing help: procgen market item names (huge update 4/10!).  (Read 12586 times)

Offline x4000

  • Chris Park, Arcen Games Founder and Lead Designer
  • Arcen Staff
  • Zenith Council Member Mark III
  • *****
  • Posts: 31,358
Hey guys!

So... this is a pretty huge favor to ask, sorry about this.  But if any of you have any time you'd be willing to donate, I'd much appreciate it.  And the market in the game will be that much cooler for it.

Background
The tech tree in this game is a hand-designed affair, and is ONLY used by the player races.  Your race is so far behind everyone else that nobody else needs what you're working on.

BUT!  There's a whole other section of the game called the Market, and it works kind of like a procedurally-generated tech tree.  It's more randomized than that, but you can affect the outcome of what you're "unlocking" in various ways.  And you can then trade those things away even if you don't find them really useful for yourself.  And you can trade for things that other races invent as time goes on.

In order to really make this shine, that involves having each market item come up with a cool procedurally-generated name.  I have the framework for that in place, and as of the 4/9 update it's going pretty swimmingly (thanks in huge part to your help).

The Krolin wrote a hilariously-titled piece of Dark Philosophy entitled: "Why Do The Fatally Crabby Die?"  Seeing that show up in my inventory just made me laugh out loud.  And there were a bunch of other good ones, too.  What a difference this makes to the feel of the market!

The Task
Okay!  After much work by many talented people, we have arrived at the (current as of now) attached spreadsheet.  The actual online spreadsheet has already evolved well beyond this, naturally, even as I write this. ;)  Here's how this works:

1. There are a variety of categories of market item types, and it looks for a column with the header matching that exact name.  The types are:
    Broadcast_Comedy
    Broadcast_Sports
    Broadcast_News
    Broadcast_Education
    Broadcast_Violent
    Broadcast_Romance
    ConsumerProduct_Hygiene
    ConsumerProduct_Clothing
    ConsumerProduct_DangerousToys
    ConsumerProduct_Technological
    ConsumerProduct_Gadgets
    Military_Components
    Military_PersonalWeapons
    Military_VehicleWeapons
    Military_PersonalArmor
    Military_VehicleArmor
    Military_PersonalBioAugments
    Military_RangedWeaponAugments
    Military_BannedWeapons
    Philosophy_Serious
    Philosophy_Dark
    Philosophy_Trivial
    Philosophy_Social
    Fiction_Uplifting
    Fiction_Tragic
    Fiction_Propaganda
    NonFiction_TortureResults
    NonFiction_MilitaryTactics
    NonFiction_Politics
    NonFiction_Science
    NonFiction_Biology
    NonFiction_Technology
    Periodical_News
    Periodical_Research
    Periodical_Propaganda
    Periodical_Entertainment
    Poetry_Humor
    Poetry_Crude
    Poetry_Uplifting
    Poetry_Romantic
    Poetry_EpicHeroism
    Poetry_Angsty

2. Once it has found the appropriate column for the type of "item" being generated, it then pulls a random row from that column.

3. It then looks into that entry and finds any tags (anything that starts with { and ends with } is a tag), and does a couple of things with them.

a. Special tag: {n} -- in this case, it places a number between 0 and 9.  I'd advise staying away from numbers and mark levels and such too much, though, because if you put that in there players will assume it has meaning when it does not.  If you want to say "A Collection of {n} Poems," that's a good use for this tag.

aa. Special tag: {n1} -- in this case, it places a number between 1 and 9.  Same notes as above.  Note that you can chain these together to make interesting longer numbers.

ab. Special tag: {nn} -- in this case, it places a number between 1 and 99.  A bit more flexibility.

ac. Special tag: {nnn} -- in this case, it places a number between 1 and 999.  A bit more flexibility.

ad. Special tag: {nnnn} -- in this case, it places a number between 1 and 9999.  Yet more flexibility.

b. Special tag: {RacialAdjective} -- In this case, it will look to the column for the race that is inventing this item.  So for the Skylaxians, this is the SkylaxiansAdjectives column, for the Andors it's the AndorsAdjectives column.  It then pulls a random row from the appropriate column.

c. Special tag: {RacialPlace} -- In this case, it will look to the column for the race that is inventing this item.  So for the Skylaxians, this is the SkylaxiansPlaceTypes column, for the Andors it's the AndorsPlaceTypes column.  It then pulls a random row from the appropriate column.

d. Special tag: {OurRacePlural} -- This is the name of the race that made it.  So if it's an acutian thing, this will always say "Acutians".  If it's peltian, then it will say "Peltians" and so on.

e. Special tag: {OurRaceSingular} -- Same deal as above, except that this is singular.  So if it's an acutian thing, it says "Acutian".

f. Special tag: {OurRandomRaceLeader} -- Each race has three possible leaders, of which only one is in the game at a given time.  However, each is a good source of racial names, right?  So this picks one of the three at random.

g. Special tag: {OtherRacePlural} -- This is the name of any other race EXCEPT the race that made it.  And it's the plural version, as you might guess.  Note that multiple calls to "OtherRace<Whatever>" in an item name will NOT return the same race each time.  It may or may not.  Best bet is to only use one of the "OtherRace<Whatever>" tags in a given name.

h. Special tag: {OtherRaceSingular} -- Same deal as above, except that this is singular.  Once again, any race EXCEPT ours.  And no consistency.

i. Special tag: {OtherRandomRaceLeader} -- Each race has three possible leaders, of which only one is in the game at a given time.  This picks the name of any leader that is NOT from our race.  Again no consistency.

j. All other tags are considered custom, and will look for a column in the spreadsheet with that same name.  So if you want to have a tag named {HiThereBob}, you would need a column with the header HiThereBob as well.  These are case-sensitive, so your capitalization matters.

4. The generator runs through this recursively until there are no more tags.  So if there are tags within tags, that is okay.  Also, if you have multiple copies of {n} or {GenericAdverb} or whatever in one name, it will pick a unique name for each instance of that tag.
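
The steps above can be sketched roughly as follows. This is a minimal illustration in Python, not the game's actual code: the COLUMNS dictionary stands in for the spreadsheet, its entries are made up, the "AcutiansAdjectives" and "AcutiansPlaceTypes" headers are guessed from the Skylaxians/Andors naming pattern, and the leader tags are left out because no leader names are given here.

```python
import random
import re

TAG = re.compile(r"\{([^{}]+)\}")

# Hypothetical stand-in for the spreadsheet: header -> rows in that column.
COLUMNS = {
    "Philosophy_Dark": [
        "Why Do The {GenericAdverb} {RacialAdjective} Die?",
        "{n1}{n} {RacialAdjective} Regrets Of The {OurRacePlural}",
    ],
    "GenericAdverb": ["Fatally", "Quietly"],
    "AcutiansAdjectives": ["Crabby", "Efficient"],
    "AcutiansPlaceTypes": ["Foundry", "Archive"],
}
RACES = {"Acutians": "Acutian", "Peltians": "Peltian"}  # plural -> singular
NUMERIC = {"n": (0, 9), "n1": (1, 9), "nn": (1, 99),
           "nnn": (1, 999), "nnnn": (1, 9999)}


def resolve(tag, race, rng):
    """Return replacement text for one tag; race is the inventor's plural name."""
    if tag in NUMERIC:
        low, high = NUMERIC[tag]
        return str(rng.randint(low, high))
    if tag == "RacialAdjective":
        return rng.choice(COLUMNS[race + "Adjectives"])
    if tag == "RacialPlace":
        return rng.choice(COLUMNS[race + "PlaceTypes"])
    if tag == "OurRacePlural":
        return race
    if tag == "OurRaceSingular":
        return RACES[race]
    if tag == "OtherRacePlural":
        return rng.choice([r for r in RACES if r != race])
    if tag == "OtherRaceSingular":
        return RACES[rng.choice([r for r in RACES if r != race])]
    # Custom tag: exact, case-sensitive header match (KeyError otherwise).
    return rng.choice(COLUMNS[tag])


def generate(item_type, race, rng=random):
    name = rng.choice(COLUMNS[item_type])
    # Each pass resolves every tag independently; repeat for tags within tags.
    while TAG.search(name):
        name = TAG.sub(lambda m: resolve(m.group(1), race, rng), name)
    return name
```

Calling generate("Philosophy_Dark", "Acutians") returns one finished name. Because every tag instance is resolved by its own call, two {n} tags in one entry get separate rolls, and two "OtherRace" tags can land on different races, which is why sticking to one of them per name is the safer bet.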

Important rules of thumb:
I. If you wouldn't mind having everything in proper capitalization, that would be awesome.  It's super easy to do by just going here: http://www.textfixer.com/tools/capitalize-sentences.php  Choose "Capitalize Every Word" and paste in your entries, and bam.

II. The inventor race is something I'd prefer be apparent.  So pretty much everything should use {RacialAdjective} or {RacialPlace} in there somewhere.  If we wind up needing to make other race-specific categories, then fine.  And I guess there can be exceptions to this.  But it's pretty nice when the race makes their own mark on things.

III. As mentioned before, the capitalization of the tags matters.  The tag name and capitalization must exactly match the header of the column to pull from.  Just re-emphasizing that. ;)

IV. Overall not having names that are ridiculously super-long is a good thing, as is not having things that seem to indicate a specific function when really the name has nothing to do with the function.  For example, "Amulet of Power Reduction" would be incredibly misleading since it might have nothing to do with power reduction.

V. If there is a gap of any rows in a column, there is a good chance it might be skipped.  It's not a certain thing, but once the spreadsheet reaches a row where no columns have any content -- even if more rows down south of that row have content in at least one column -- then it stops.  So leaving gaps is something to avoid when possible.
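
To make rule V concrete, here is a small Python sketch of one way a reader like that could behave. It is an assumption about the behavior described, not the game's loader: it skips blank cells inside a column, and it stops for good at the first row where every cell is blank.

```python
def read_columns(rows):
    """rows: list of rows, each a list of cell strings; rows[0] holds headers."""
    headers = rows[0]
    columns = {header: [] for header in headers if header}
    for row in rows[1:]:
        if not any(cell.strip() for cell in row):
            break  # fully empty row: nothing below it is read
        for header, cell in zip(headers, row):
            if header and cell.strip():
                columns[header].append(cell.strip())
    return columns


sheet = [
    ["Poetry_Humor", "Poetry_Crude"],
    ["Ode To {n} Socks", "Limerick {nn}"],
    ["", "Another Limerick"],  # gap in one column only: reading continues
    ["", ""],                  # gap across all columns: reading stops
    ["Lost Entry", "Also Lost"],
]
print(read_columns(sheet))
```

In this sketch the last row never makes it into the lists, which is the kind of silent loss that leaving gaps can cause.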

-- Most importantly, here's the actual sheet where data is to be entered at this point: https://docs.google.com/spreadsheets/d/1ngsCKxrQUVM5JRxN5eN-ZnSbWaPvw9G5BKcbFARhVRk/edit#gid=0

Thank you very much for any help!
« Last Edit: April 10, 2015, 09:09:52 AM by x4000 »
Have ideas or bug reports for one of our games?  Mantis for Suggestions and Bug Reports. Thanks for helping to make our games better!

Offline Fabulous

  • Newbie
  • *
  • Posts: 2
I'm not a very creative fellow, so I'm not sure I can help much. But you could add "{0} Biography" on non-fiction and I guess that would be nice.

Offline x4000

  • Chris Park, Arcen Games Founder and Lead Designer
  • Arcen Staff
  • Zenith Council Member Mark III
  • *****
  • Posts: 31,358
Done, thanks!

Offline topper

  • Sr. Member
  • ****
  • Posts: 307
Quote
BUT!  There's a whole other section of the game called the Market, and it works kind of like a procedurally-generated tech tree.  It's more randomized than that, but you can affect the outcome of what you're "unlocking" in various ways.  And you can then trade those things away even if you don't find them really useful for yourself.  And you can trade for things that other races invent as time goes on.
--snip--
In order to really make this shine, that involves having each market item come up with a cool procedurally-generated name.  I have the framework for that in place, but coming up with content for those is... quite a tough thing.
I love the idea of procedural item names (and item descriptions?). It works really well in DF at least, which I know you guys are familiar with.

But one thing that can be a turnoff to enjoyment is if there are so many different names that they lose meaning and understanding. For example, what is the significance of "The Brinkley Review" vs "The Brooke Review" vs "The Brooke Session"? Brinkley vs Brooke is nice variation. Review vs Session makes me need to read a tooltip to know if there is an actual difference.  Meaning: Having lots of names like in the google doc is good, but reducing the number of structures like what you have shown above would be clearer to me gameplay-wise. e.g. only 2 types of broadcast name structures but lots of modifiers and/or adjectives. For example:  The {adjective} {name} broadcast on {noun} structure for all broadcasts.

For an example from DF wiki:
Stukos Cilobazin, Miner has created Govoskatdir Neth Tilesh, a bituminous coal millstone.
This tells exactly what the object is and what it is made of as well as having the interesting procedural name.

Or another example that would be for something art-related:
Tabaralen, "The Faithful Moth" This is a giant cave spider chitin shield. All craftsdwarfship is of the highest quality. It is decorated with turtle shell and goblin bone. This object menaces with spikes of giant cave spider chitin, Native gold and Tower-cap. On the item is an image of two shields in Phyllite. On the item is an image of Tirist Leaderhammer the dwarf and dwarves in turtle shell. Tirist Leaderhammer is surrounded by the dwarves. The artwork relates to the ascension of the dwarf Tirist Leaderhammer to leadership of The Boats of Swallowing in 98. On the item is an image of Mosus Autumnstockade the dwarf in Phyllite.
This has the interesting procedural information while still indicating what the thing is in the first couple words.

You know you want menacing spikes on things  :P

Quote
1. I have sccuddly hugd a variety of data that I'm using right now, but there's a lot of junk that got sccuddly hugd in there as well.  As well as uninteresting things that will seem just very odd when made into market items.  So those need to be taken out of the lists.
The forum text mods are a gift that keeps on giving.  :)

Offline x4000

  • Chris Park, Arcen Games Founder and Lead Designer
  • Arcen Staff
  • Zenith Council Member Mark III
  • *****
  • Posts: 31,358
The forum text mods are just too fun to even try to fix, though. ;)

It's a good point on the procedural names, but at the moment there are 54 different possible effects that an item can have, and each item has two of them, with a numeric quality that is split between them anywhere from 90/10 to 50/50.

I don't think that the names of the items can really reflect what the items do, and right now there's no connection whatsoever.  Most likely that will have to be done with little icons.

Offline wwwhhattt

  • Full Member
  • ***
  • Posts: 124
I'm not sure the philosophy list in its current form will fit with the formats, partly because most of the entries on the list are names while a lot of the formats seem to be better suited for theories - "The Bentham System" doesn't sound quite right, unlike "The Utilitarian System". Maybe if you want them to be about philosophers it should go {0}'s Conviction, The Stance of {0}.

An actual problem with the list is that all the entries are one word only, so names are being split up - e.g. Hannah and Arendt each having their own place. As far as that goes you could just have all the first names deleted. There are also some titles like Ibn or Tzu that should be attached to names, and which can't be missed out from names (having said that, I'm kind of assuming levels of accuracy which shouldn't be expected with the timescale).

If you do want name based philosophy the entries could always come from the General Writing doc, in which case the philosophy doc could be emptied into there.

I think the main difference in flavour is that name centred works sound like the author is concerned with teaching someone else's ideas, so the focus is on the past (which might work well with the theme of re-discovering the human past), while concept centred works sound like they're writing about their own ideas, with a focus on the present.

Format Suggestions:

The Classic of {0}
The Way of {0}
Understanding {0}
The Wisdom of {0}
The Poverty of {0}
A Critique of {0}
On {0}
An Introduction to {0}
Being and {0}
{0}

Those hopefully all work for either names or concepts.

Offline x4000

  • Chris Park, Arcen Games Founder and Lead Designer
  • Arcen Staff
  • Zenith Council Member Mark III
  • *****
  • Posts: 31,358
Cool, I really like those formats!  I've updated it.

And yes, I'd be quite happy to have concepts instead of names.  Or a mix of both.

Offline TheVampire100

  • Master Member
  • *****
  • Posts: 1,382
  • Ordinary Vampire
For the periodical writings, how about "The {0} Times"?

Offline x4000

  • Chris Park, Arcen Games Founder and Lead Designer
  • Arcen Staff
  • Zenith Council Member Mark III
  • *****
  • Posts: 31,358
Nice!  Added, thanks.

Offline wwwhhattt

  • Full Member
  • ***
  • Posts: 124
Does the philosophical document have to be made up from one word entries, or can I put in full names and multi-word concepts?

Glad you like the formats, here's more!

Against {0}
Anti-{0}
{0} and {0}        ({0} and {1}?)
The {0} of {0}    (The {0} of {1}?)
{0}'s {0}            ({0}'s {1}?)
In Defence of {0}
In Praise of {0}
The Essence of {0}
{0} Through History
{0}, the Basic Works
The Foundations of {0}
Notes on {0}
The Origins of {0}
Thinking Through {0}
An Enquiry Concerning {0}
A Treatise on {0}
{0} in the Present Age
After {0}
The {0} of the Future
{0} On Trial
The Search After {0}
In Search of {0}
What Has {0} Ever Done For Us?


Edit: was going to delete some because repetition didn't suit them, decided to leave it seeing as I don't know how often they'll turn up. Very not sure about Anti-{0}, sometimes it sounds right and sometimes it doesn't, even with the same word.
« Last Edit: January 30, 2015, 04:02:31 PM by wwwhhattt »

Offline x4000

  • Chris Park, Arcen Games Founder and Lead Designer
  • Arcen Staff
  • Zenith Council Member Mark III
  • *****
  • Posts: 31,358
Nice!  And yeah, they can include {1} for a second name to get pulled into there, if you wish.  If you use {0} more than once, it will just repeat the first name multiple times (which is sometimes useful).  I've updated the list with those!  Thanks again.
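
The {0}/{1} behavior described here happens to match positional format strings, so it can be illustrated with Python's str.format. This is only an analogy, not a claim about how the game does it: the fill helper is hypothetical, the entries are made-up placeholders, and drawing two different entries for {0} and {1} is an assumption.

```python
import random


def fill(fmt, entries, rng=random):
    """Fill {0} and {1} in fmt with two different random entries."""
    first, second = rng.sample(entries, 2)
    # {0} always gets `first`, however many times it appears in fmt.
    return fmt.format(first, second)


entries = ["Virtue", "Reason", "Doubt"]  # placeholder concept list
print(fill("The {0} of {0}", entries))   # same entry twice
print(fill("The {0} of {1}", entries))   # two different entries
print(fill("In Praise of {0}", entries))
```

So "{0} and {0}" only makes sense for formats where the repetition is the point, and the ({0} and {1}?) variants are the ones to use when two different entries are wanted.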

Offline ptarth

  • Arcen Volunteer
  • Hero Member Mark III
  • *****
  • Posts: 1,150
  • I'm probably joking.
How do you feel about putting it into a Google Doc Spreadsheet instead of the text document version?

More specifically, with each Format Specification being on its own tab/page, followed by the lists for each variable field, each entry in its own cell, and all of the datasets in a single document instead of spread across multiple browser tabs.
I'd also suggest changing {0} (which is used multiple times) into {a}, {b}, {c}, etc. That way we know the Consumer Products Name list is {b} instead of {0}, which is used for everything. Given the labels can be easily switched back when they are implemented with search & replace, there is no cost to organizing it this way. However, there is a benefit in making it easier to distinguish between the lists during creation, especially if the crowdsourcing effort is going to work out. Having to say "the {0} List for Fiction:General" is a bit more awkward than "{d}". I'd do it myself, but then you guys wouldn't have creator privileges and the ability to always have access.
Note: This post contains content that is meant to be whimsical. Any belittlement or trivialization of complex issues is only intended to lighten the mood and does not reflect upon the merit of those positions.

Offline x4000

  • Chris Park, Arcen Games Founder and Lead Designer
  • Arcen Staff
  • Zenith Council Member Mark III
  • *****
  • Posts: 31,358
I'm about to turn into a pumpkin for today, but if you want to create it I'm cool with that -- I'm not worried you're going to run off with the spreadsheet.  And you actually can transfer ownership of a spreadsheet if you want to, too.

Generally speaking I'd suggest just sticking to {0}, since I'm not sure why these would be discussed in the same context anyhow... but not sure how others feel.

Offline ptarth

  • Arcen Volunteer
  • Hero Member Mark III
  • *****
  • Posts: 1,150
  • I'm probably joking.
So I think topper is on a better route. Determining a linguistic structure for each Market item makes the task more exploitable by the procedural generation techniques. Given how much Arcen uses procedural generation, I won't start preaching to the crowd. I will say that having a casual linguist create these would probably be better, but this is the internet, so let's see what enthusiastic ignorance can accomplish first. I'm writing this for two different audiences (people who code and use Procedural Generation and those that don't), so please be patient with my expositions.

So, using a simple procedural structure, how far can we go? Well, here is a list of 100 Media Standard Entries I procedurally generated, without any complex filtering.

Code:
[1] "Lip This Week"
[1] "Distribution News"
[1] "Reluctantly Detailed Animal Lake Existence Show"
[1] "Tax Exchange"
[1] "Driving Outsiders"
[1] "Last Mine Show"
[1] "Bottle Outside"
[1] "Argument Live"
[1] "Match Exchange"
[1] "Any Ol' Verdant Apparel Lead Sports"
[1] "Wind Exclusive"
[1] "Stamp Talk"
[1] "Downtown Morning"
[1] "Yure Foot Sports Talk"
[1] "Longing Join Accurate Amazing Form Pear Fruit Year"
[1] "Balance Exclusive"
[1] "Black-and-white Cat Beetle Now"
[1] "Authority Exclusive"
[1] "Quince At Midnight"
[1] "Statement Year"
[1] "Pet Report"
[1] "Deer Show"
[1] "Nebulous Dog Week At Midnight"
[1] "Fewscore Caption Minute"
[1] "Tart Brothers Journey Center"
[1] "Peace Sports Talk"
[1] "Automatic Angle Kite Day"
[1] "Seed Report"
[1] "Hulking Boot Graceful Ants Gamy Locket Grade Live"
[1] "Place Sports Talk"
[1] "Table In Depth"
[1] "Thread Business"
[1] "That There Potato Report"
[1] "Smile Exchange"
[1] "Quietly Pricey Drink Gun Old-fashioned Balloon Canvas Hour"
[1] "Fewscore Card Outsiders"
[1] "Enough Quicksand Outsiders"
[1] "Bedroom At Midnight"
[1] "Friend Business"
[1] "Voice In Depth"
[1] "Eyes Talk"
[1] "Society Outsiders"
[1] "Tramp Opinion"
[1] "Usefully Tasteless Throne Distance Jewel Exclusive"
[1] "Son Sports"
[1] "Lively Porter Camera Business"
[1] "More Wealthy Shallow Decorous Holistic Cloud Farmer Lake Hat Bat Bit This Week"
[1] "Great Chairs Expert Talk"
[1] "Reward This Week"
[1] "Toes Opinion"
[1] "Women Second"
[1] "Calendar News"
[1] "Wrist In The Morning"
[1] "Orange Instrument Dust Day"
[1] "Seat Now"
[1] "All Ducks Insiders"
[1] "Legs Insiders"
[1] "Straw Report"
[1] "Angrily Icy Tangy Bucket Imperfect Existence Song Dark Egg Agreement Pipe This Week"
[1] "Comb Sports Talk"
[1] "Tough Heat Report"
[1] "Top And You"
[1] "Office In The Morning"
[1] "Quiet Lace Comfort Exchange"
[1] "Amount Business"
[1] "Owne Turn Day"
[1] "Care Exclusive"
[1] "Adaptable Pet Skate Live"
[1] "Music At Noon"
[1] "Quill And You"
[1] "Push Hour"
[1] "Straw At Noon"
[1] "Sense This Week"
[1] "Organization Sports"
[1] "Quodque Joke Year"
[1] "Frog In Depth"
[1] "Sky Insiders"
[1] "Sufficient Apparatus Money"
[1] "Rest Sports Talk"
[1] "Lip In Depth"
[1] "Some Advertisement Outside"
[1] "Daily Lewd Hope Incandescent Questioningly Eight Religion Clouds Sweater Face Question News"
[1] "Last Linen Sports Talk"
[1] "Too Axiomatic Chicken Scarf Ants Sports"
[1] "Full Cough Fall And You"
[1] "Flavor Minute"
[1] "In Depth Dogs Now"
[1] "Dazzling Talented Tidy Hearing Pump Pen View Business"
[1] "Smoggy Education Hate This Week"
[1] "Zis Zephyr Second"
[1] "Any And All Whole Nation Ice And You"
[1] "Person In Depth"
[1] "Wax Outside"
[1] "Shaggy Mother Page Sports"
[1] "Yak Show"
[1] "Stamp Day"
[1] "Euerie Rings Center"
[1] "Fork Talk"
[1] "Beaucoup Team Outside"
[1] "More And More Grandfather Opinion"

Sure. Some of those are rubbish; "Quietly Pricey Drink Gun Old-fashioned Balloon Canvas Hour" is a good example.
But most of those are pretty good.

How do we do it?
We break the Market item up into Linguistic Structures and then after defining them, run a program to generate them.

Step 1: Create WordLists for the basic speech parts: noun, verb, adjective, adverb, preposition, determiner, etc  AND a few more specific lists, e.g., Media Suffix, Scientific Names, Technology Names, etc
Step 2: Create more complex PhraseStructures: Noun Phrase, Adjective Phrase, Prepositional Phrases
Step 3: Define the highest level items using Complex PhraseStructures which in turn are defined as items randomly selected from a WordList

My Setup Details
I'll discuss new types of items, but not re-explain already addressed syntax.

{MediaStandard} <- {DeterminerPhrase}{MediSuffix} #def
Here we define the MediaStandard tag as being composed of a DeterminerPhrase followed by a MediaSuffix.

{DeterminerPhrase}  #def
{NounPhrase}
{DetPhrase}{NounPhrase}

Here we define a DeterminerPhrase as EITHER only a NounPhrase or a DetPhrase AND a NounPhrase. Alternatively, we could have defined MediaStandard directly as either a {DetPhrase}{NounPhrase}{MediaSuffix} or a {NounPhrase}{MediaSuffix}. The probability of each definition being used can be set independently for each case.

{DetPhrase} #def
{Determiner}

Here we define a DetPhrase as a random word from the Determiner word list.

{NounPhrase}  #def
{Noun}
{AdjectivePhrase}{NounPhrase}

{AdjectivePhrase} #def
{Adjective}{NounPhrase}
{Adverb}{AdjectivePhrase}{NounPhrase}

{PrepositionPhrase}#def
{Preposition}{NounPhrase}

##Individual Words
{Preposition}#def
{Noun}#def
{Adjective}#def
{Adverb}#def
{Determiner}#def

And there you have it.
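
For those who code, the definitions above can be written down as plain data plus one recursive function. This is a Python sketch rather than the attached R code, under several assumptions: the word lists are tiny placeholders borrowed from the sample output and sorted by guesswork, every alternative is equally likely, and the max_depth cutoff is an addition to guarantee the recursion stops. The tag is spelled MediSuffix as in the definition above, and PrepositionPhrase is omitted because MediaStandard never reaches it.

```python
import random

# Phrase rules from the post: each tag maps to its alternative expansions.
GRAMMAR = {
    "MediaStandard": [["DeterminerPhrase", "MediSuffix"]],
    "DeterminerPhrase": [["NounPhrase"], ["DetPhrase", "NounPhrase"]],
    "DetPhrase": [["Determiner"]],
    "NounPhrase": [["Noun"], ["AdjectivePhrase", "NounPhrase"]],
    "AdjectivePhrase": [["Adjective", "NounPhrase"],
                        ["Adverb", "AdjectivePhrase", "NounPhrase"]],
}

# Placeholder word lists (assumption: not the real lists from the attachment).
WORDS = {
    "Determiner": ["All", "Some", "That"],
    "Noun": ["Ducks", "Straw", "Frog", "Seed"],
    "Adjective": ["Shaggy", "Tough", "Quiet"],
    "Adverb": ["Quietly", "Angrily"],
    "MediSuffix": ["Report", "This Week", "At Midnight", "Sports Talk"],
}


def expand(symbol, rng, depth=0, max_depth=6):
    """Expand one tag into a list of words."""
    if symbol in WORDS:
        return [rng.choice(WORDS[symbol])]
    options = GRAMMAR[symbol]
    # Past max_depth, always take the first (shortest) alternative.
    option = options[0] if depth >= max_depth else rng.choice(options)
    words = []
    for part in option:
        words.extend(expand(part, rng, depth + 1, max_depth))
    return words


rng = random.Random()
for _ in range(5):
    print(" ".join(expand("MediaStandard", rng)))
```

Swapping in bigger word lists changes the flavor without touching the function, which is the reuse across Market items that this setup is after.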

An Example
Let's do an example to help illustrate. At each tag, we search and replace it. Once we get to a single word tag, we leave it alone and go on to the next tag. Once all tags are single words we replace them with randomly selected words from our lists.

MediaStandard is {DeterminerPhrase}{MediSuffix}
We look for a {DeterminerPhrase} and find: {DetPhrase}{NounPhrase}, so we replace {DeterminerPhrase} with {DetPhrase}{NounPhrase}. This was one of two possible outcomes.
We now have: {DetPhrase}{NounPhrase}{MediSuffix}
We look for a {DetPhrase} and find: {Determiner}, so we replace it.
We now have: {Determiner}{NounPhrase}{MediSuffix}
We look for a {NounPhrase} and find: {AdjectivePhrase}{NounPhrase}, so we replace it.  This was one of two possible outcomes.
We now have: {Determiner}{AdjectivePhrase}{NounPhrase}{MediSuffix}.
We look for a {AdjectivePhrase} and find: {Adverb}{AdjectivePhrase}{NounPhrase}, so we replace it.  This was one of two possible outcomes.
We now have: {Determiner}{Adverb}{AdjectivePhrase}{NounPhrase}{NounPhrase}{MediSuffix}.
We look for a {AdjectivePhrase} and find: {Adjective}{NounPhrase}, so we replace it.  This was one of two possible outcomes.
We now have: {Determiner}{Adverb}{Adjective}{NounPhrase}{NounPhrase}{NounPhrase}{MediSuffix}.
We look for a {NounPhrase} and find: {Noun}, so we replace it. This was one of two possible outcomes.
We now have: {Determiner}{Adverb}{Adjective}{Noun}{NounPhrase}{NounPhrase}{MediSuffix}.
We look for a {NounPhrase} and find: {Noun}, so we replace it. This was one of two possible outcomes.
We now have: {Determiner}{Adverb}{Adjective}{Noun}{Noun}{NounPhrase}{MediSuffix}.
We look for a {NounPhrase} and find: {Noun}, so we replace it. This was one of two possible outcomes.
We now have: {Determiner}{Adverb}{Adjective}{Noun}{Noun}{Noun}{MediSuffix}.
We then sample from our lists. "All Acidly Advice Afternoon Air Weekly".

Looking at this, you can already tell we could make it better by reducing the number of times we generate a NounPhrase and replacing it with a simple noun. In those cases where we do have long names, it is because the circular adverb-adjective-noun loop keeps recurring. I tweaked it by hand, reducing the probabilities of longer phrases, but redefining AdjectivePhrases and NounPhrases would probably work better.
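
A sketch of the probability tweak, again in Python and again with placeholder word lists. The 0.8/0.2 weights are purely illustrative, not the values actually used. As a rough guide: if both alternatives were equally likely, each NounPhrase would spawn 1.5 new NounPhrases on average (an AdjectivePhrase yields 2 on average, so 0.5 x 3), and names could keep growing; at 0.8/0.2 the average drops to 0.45 (0.2 x 2.25), so long names become rare. The loop replaces the leftmost tag each step, like the worked example.

```python
import random

# Each alternative carries a weight; weights here are illustrative only.
RULES = {
    "MediaStandard": [(["DeterminerPhrase", "MediSuffix"], 1.0)],
    "DeterminerPhrase": [(["NounPhrase"], 0.5),
                         (["DetPhrase", "NounPhrase"], 0.5)],
    "DetPhrase": [(["Determiner"], 1.0)],
    "NounPhrase": [(["Noun"], 0.8),
                   (["AdjectivePhrase", "NounPhrase"], 0.2)],
    "AdjectivePhrase": [(["Adjective", "NounPhrase"], 0.8),
                        (["Adverb", "AdjectivePhrase", "NounPhrase"], 0.2)],
}

# Placeholder word lists (assumption: not the real lists).
WORDS = {
    "Determiner": ["Every", "Those", "Your"],
    "Noun": ["Stone", "Ocean", "Kite", "Lawyer"],
    "Adjective": ["Lucky", "Empty", "Exciting"],
    "Adverb": ["Broadly", "Defiantly"],
    "MediSuffix": ["Now", "In The Morning", "Exchange", "Opinion"],
}


def generate(start, rng):
    """Expand tags left to right until only words remain."""
    pending = [start]  # tags still waiting to be replaced, leftmost first
    words = []
    while pending:
        symbol = pending.pop(0)
        if symbol in WORDS:
            words.append(rng.choice(WORDS[symbol]))
            continue
        expansions, weights = zip(*RULES[symbol])
        chosen = rng.choices(expansions, weights=weights)[0]
        pending = list(chosen) + pending
    return " ".join(words)


rng = random.Random()
for _ in range(5):
    print(generate("MediaStandard", rng))
```

Redefining the phrases so an AdjectivePhrase no longer drags a NounPhrase along would attack the same problem structurally instead of by weights.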

The word lists can either be reused as is for each of the different Market items or modified to include or exclude items. For example, one can define Military Items with specific Military nouns in addition to regular nouns.  The hard part would then be generating market-item specific lists. For this example I combined the MediaPrefix items with the Determiner items (e.g., Think, Tough, Unscripted, Up Close with). I also created unique MediaSuffix items from the list x4000 provided (e.g., Tonight, This Week, Show, Report, In The Morning, In The Evening, At Noon, At Midnight).

Advantages over what is being done.
  • Simplification of Lists: Currently, looking at the separate word lists on Google Docs, many of the items are repeated. By having more generic word lists that are then referred to as needed we reduce repetition.
  • More versatile: "Shaggy Mother Page Sports" and "All Ducks Insiders" are not items that we would have reasonably generated using the original definition of Media, but are easily generated.
  • More combinations: There are many, many possible item combinations. Given that it is recursive, I would have to do a lot of math to calculate how many there are.
  • Less to maintain: Maintenance would be easier (although harder too). x4000's original Media Frames were a list of 41 different items. Here, we technically have one definition for MediaStandard which is: {DeterminerPhrase}{MediSuffix}. Of course you also have the subtag components to define. But most of those will be reused across all Market Items.

Of course, I'm just a guy with too much time on his hands and a soap box.
Attached is the zipped code I used. The code is written in R. Also attached are the wordlists in case someone wants to steal them. They were stolen from Wikipedia and some random webpages. They will work as reasonable proxies until better lists are found.  x4000's lists were too heavily flavored with Countries and other special Proper nouns to make it work well.

Update
I tweaked the code a bit and simplified some Noun Phrases. Probably a bit too much trimming, but the variation is less extreme.
Code:
[1] "Such Appliance In The Morning"
[1] "Lawyer Morning"
[1] "Defiantly Empty Airplane Outside"
[1] "Severall Beef Outsiders"
[1] "Nil Amuck Purpose Sports"
[1] "Band Minute"
[1] "Cow This Week"
[1] "Scene Sports"
[1] "Any And All Shock Morning"
[1] "Anotha Son And You"
[1] "Club Day"
[1] "Quarter Sports"
[1] "Cord Second"
[1] "Flag In The Morning"
[1] "Interest Sports Talk"
[1] "Kite At Midnight"
[1] "Act Center"
[1] "Every Stone Sports Talk"
[1] "Vest In The Morning"
[1] "Last Operation Exclusive"
[1] "Badly Malicious Birth In The Evening"
[1] "Just Page Talk"
[1] "Board Opinion"
[1] "Support Live"
[1] "Cemetery This Week"
[1] "Veil Talk"
[1] "Robin Opinion"
[1] "Whichever Jar And You"
[1] "Gold Insiders"
[1] "Some Ole Truck Exchange"
[1] "Guide Show"
[1] "That Kiss At Midnight"
[1] "Crack Talk"
[1] "Caption At Midnight"
[1] "Umpteen Lucky Temper Business"
[1] "Coach Report"
[1] "Pleasure Exchange"
[1] "Cream Morning"
[1] "Euerie Boy Report"
[1] "Hour Day"
[1] "Fewer Hose Year"
[1] "Tough Beast Center"
[1] "Magic Sports Talk"
[1] "Broadly Direful Look Day"
[1] "Extremely Beautiful Steam Sports"
[1] "Those Pets Year"
[1] "Lead Day"
[1] "Your Ocean Now"
[1] "Language Now"
[1] "Grandmother At Noon"
[1] "Need Minute"
[1] "Swim Sports Talk"
[1] "Various Floor News"
[1] "Powder Minute"
[1] "Atta Help News"
[1] "Almost Woebegone Deer At Midnight"
[1] "Bath Outside"
[1] "Cast Year"
[1] "Inside Bedroom Talk"
[1] "Girl Day"
[1] "Pigs Outside"
[1] "Spade Outsiders"
[1] "Stranger Live"
[1] "Cup Second"
[1] "Use Money"
[1] "Broadly Tasteful Dog Morning"
[1] "Sand Opinion"
[1] "Any Ol' Tail At Noon"
[1] "Price In The Evening"
[1] "Soup Year"
[1] "Exciting Plough Day"
[1] "Sea Now"
[1] "Foot Business"
[1] "Hair Second"
[1] "Locket This Week"
[1] "Religion Money"
[1] "Surprise Hour"
[1] "Quite A Few Veil News"
[1] "Immediately Lush Finger Talk"
[1] "Hes Harmonious Car And You"
[1] "Much Cry Talk"
[1] "Advice In The Morning"
[1] "Oranges Talk"
[1] "Each Flight Live"
[1] "Condition Business"
[1] "Surprise Opinion"
[1] "Aunt Money"
[1] "Time News"
[1] "Steel In Depth"
[1] "Dust Sports"
[1] "What Boy News"
[1] "Beam And You"
[1] "Wealth Sports Talk"
[1] "Educated Duck Minute"
[1] "Tightfisted Goat Sports Talk"
[1] "Hella Bat In The Evening"
[1] "Place In The Evening"
[1] "Thilk Eggnog Year"
[1] "That Chin Exclusive"
[1] "Some Ole Box This Week"
« Last Edit: January 30, 2015, 09:58:40 PM by ptarth »

Offline ptarth

  • Arcen Volunteer
  • Hero Member Mark III
  • *****
  • Posts: 1,150
  • I'm probably joking.
More in line with the original request.
I took a crack at the military terms sheet using a more automated approach.

1. Removed all numbered references (The F86 was a great plane, but I don't think that's suitable for a different culture/planet/universe).
2. Removed capital letters (Redundant items, e.g., Industrial and industrial)
3. Removed all words with nonalpha characters in them (e.g., Germany/Sweden)

That removed over 1000 items and left 3368 items.
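
The three cleanup steps are simple enough to sketch in Python. This is an illustration of the idea, not the script actually run: clean_terms is a hypothetical helper, and it treats spaces and hyphens as nonalpha characters too, so multi-word entries get dropped along with things like "Germany/Sweden".

```python
def clean_terms(terms):
    """Lowercase, drop entries with any non-letter, and remove duplicates."""
    seen = set()
    kept = []
    for term in terms:
        word = term.strip().lower()  # step 2: fold case so duplicates collide
        if not word.isalpha():       # steps 1 and 3: digits, slashes, spaces
            continue
        if word in seen:
            continue
        seen.add(word)
        kept.append(word)
    return kept


sample = ["Industrial", "industrial", "F86", "Germany/Sweden", "howitzer"]
print(clean_terms(sample))  # ['industrial', 'howitzer']
```

The surviving entries come out lowercase, so they would need re-capitalizing before going back into the sheet.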

Remaining Problems
Too many country names. I don't think France is appropriate.
Too USA-centric. There are a couple of German and Swedish names in there, but otherwise just the USA.
Too many acronyms. What's an AAVP? More importantly, is that relevant for them (i.e.,  a different culture/planet/universe)?

Suggestion
Parse lists into sublists to capture the origin of military terms and go fill those lists independently.
1. Gods- Norse, Greek, Egyptian, etc
2. Animals - Weasel, Wart Hog, Fire Hawk, Viper, etc
3. Military technology - Gun, Laser, Howitzer, Cannon, etc
4. Create a script to generate Letter and Number combinations - F-22, F-86, M-1, etc
5. Star Names - Alpha Centauri, Proxima, Sol, etc
6. In-game Race flavor Names - Thoraxians, Peltians, etc
7. Hi-tech Jargon - Tachyon, Warp, Thermal Fusion, etc
8. Mythical Animals - Sphinx, Phoenix, Behemoth, etc
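
For item 4, a script along these lines would do. It is a Python sketch with assumed limits (one or two capital letters, a hyphen, a number from 1 to 99), chosen only to cover the F-22 / F-86 / M-1 style of examples; designation is a hypothetical helper name.

```python
import random
import string


def designation(rng=random):
    """Return a letter-number designation such as 'F-22' or 'KV-7'."""
    letter_count = rng.randint(1, 2)
    letters = "".join(rng.choice(string.ascii_uppercase)
                      for _ in range(letter_count))
    number = rng.randint(1, 99)
    return f"{letters}-{number}"


print([designation() for _ in range(5)])
```

In the spreadsheet's own terms, much the same effect could probably be had from a small column of letters combined with the {nn} tag.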
« Last Edit: January 31, 2015, 02:28:27 AM by ptarth »