
She tastes like a comet

it is getting a little smarter now (might have to remove its knowledge of symbols such as the double quote and the bracket since it never uses them correctly - other punctuation it does a little better with):
----------------------------------------
He still makes missing enough

That she would electrocute us all

You try at working on the fire.

I don't belong here?
I don't want to belong enough
If you sulk, sometimes
He wants touching me away.

Don't see you to listed, he knows what is broke any real friends, oh no
We do we show him out when I get rid of itself

It we shouting,
she run, run

All the thing
My uncle Bill
My Belisha beacon

Stop whispering, caught in thinkillers

And teach you home sunshine, made men say "I don't want to happen

Where are you.

Been the saying in your teeth a broken

You care?
I don't know what do you care,
could stars me out

Don't leave me and what do you thinking about you, yourself, what do you?
How do yourself
You can be frightened
You care my real things into position,
your dressing it
Can't get the he cheats me away from the thing you care
when the black
Grow my hair, grow
----------------------------------------
 
this one is a 5 char sampling rate. I tried a 4+1+4... and it wasn't too good. This 5 rate still finds itself saying too many exact same things directly from songs instead of just being influenced by them. so perhaps a 4+3+4...
----------------------------------------
The rats and chat
Well of course I'd like to help the furniture
A bump on the head

I'm amazed that feels like spinning plates.
I'm living in a bunker?
Who's in a fantasy world
You're leaving
The crackle of pigskin
The dust and futures
who lock up the walls
Climbing up the walls

It's holding on the kids in half
Cut the kids in a bunker?
Who's in the pot

I wants to your vote.

Where'd you park the car
Clothes are trapdoors
Doors in themselves
and I burn all I can't remember
Why don't you quiet down
Why don't you quiet down.

Sometimes I get to breath of the meaning of life.

In a little row boat

there was nothing to doubt
there was nothing to feel it in your bones
The morning bell
The mouse
Squash his head
Fifteen blows to the lock
And there was nothing to keep forget?
Why can't read

I'm a reasonable man
get off my case
I'm a reasonable man, slow down.

Sing up the furniture
A bump on the basement
And if you thing
My uncle Bill
My Belisha beautiful world
I wish, I will stop at nothing.
Say the ones
When I am Jim Morrison.
Here I'm a reasonable man,
get off
I'll be the one screaming
 
well right now it has all their albums in there. I'm just taking it from the greenplastic lyrics page. I will add the b-sides next.
so it is limited in size as to how large it can grow, by how prolific they are :)

technically you could add in a dictionary of the entire english language, but it wouldn't really help b/c you would lose all contextual stylings.
it is a probability matrix - so it looks and says, okay, when these 4 characters are in a row, then there is X probability that the next 4 characters will be 'abcd' etc etc.
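
A minimal sketch of that kind of matrix, in Python rather than the Perl the original was written in. The names `build_counts` and `next_probabilities` are made up for illustration, and sliding the window one character at a time is an assumption, since the step size isn't stated.

```python
from collections import Counter, defaultdict

def build_counts(text, n=4):
    """Count how often each n-character chunk is followed by each next n-character chunk."""
    counts = defaultdict(Counter)
    # Assumption: the window slides one character at a time.
    for i in range(len(text) - 2 * n + 1):
        counts[text[i:i + n]][text[i + n:i + 2 * n]] += 1
    return counts

def next_probabilities(counts, key):
    """Turn the raw counts for one chunk into probabilities that sum to 1."""
    followers = counts.get(key, {})
    total = sum(followers.values())
    return {chunk: c / total for chunk, c in followers.items()}

counts = build_counts("abcdabcdabcx")
print(next_probabilities(counts, "abcd"))
```

In this toy string "abcd" is followed once by "abcd" and once by "abcx", so each gets half of the probability.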

so yes, as the amount of text that it has to sort through grows, then the better it gets at not repeating exact phrases and the better it gets at speaking/writing like the data that it samples from, at least stylistically - and also the slower it gets each run.

I did something similar to this a few times as bots on the boards. it would scan various boards, build up the matrix, and then form comments and post them back.
the problem with that is 1) the grammar on the board sucks ass, so its grammar was bad too, and 2) all the formatting to find the proper text, and all that threw it off a bit.

it is easier if you scan the newsgroups and post there instead. project gutenberg is excellent as well - the first one I ever tried it on was all of Poe's collected writings. that worked well since there is soooo much out there.

I also thought about trying to do william gibson since at least neuromancer is online.
 
there is another technique to use called a genetic algorithm. in that case, the markov matrix would really be there for scoring purposes.
you would randomly move letters about and you would do that some number of times. those attempts would be your gene pool. you would then kill off the lowest scoring one, and then mate the highest scoring one with one of the other ones.
you then take that offspring and mutate it some given number of times to repeat the cycle.
the scoring is based on the probability matrix and would thereby allow you to grow closer over time to a more perfect randomly generated sentence or phrase.
in that case, you would want an absolutely huge probability matrix based on the language and/or style of text that you are trying to mimic.
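
A rough sketch of one way to read that cycle, in Python rather than Perl. Everything here is hypothetical: the function names, the pool size, the generation count, and the use of simple 2-character pair counts as a small stand-in for the full probability matrix. Note that the crossover in `mate` can duplicate or drop letters, so it is looser than strictly "moving letters about".

```python
import random
from collections import Counter

def pair_counts(corpus):
    """Stand-in for the probability matrix: how often each 2-character pair occurs."""
    return Counter(corpus[i:i + 2] for i in range(len(corpus) - 1))

def score(candidate, counts):
    """Higher when the candidate is built from pairs that are common in the corpus."""
    return sum(counts[candidate[i:i + 2]] for i in range(len(candidate) - 1))

def mutate(candidate, rng):
    """Swap two randomly chosen positions."""
    chars = list(candidate)
    i, j = rng.randrange(len(chars)), rng.randrange(len(chars))
    chars[i], chars[j] = chars[j], chars[i]
    return "".join(chars)

def mate(a, b, rng):
    """Single-point crossover: the head of one parent plus the tail of the other."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]

def evolve(seed, counts, pool_size=20, generations=300, rng=None):
    """Assumes len(seed) >= 2 and pool_size >= 3."""
    rng = rng or random.Random()
    pool = [mutate(seed, rng) for _ in range(pool_size)]  # the gene pool
    for _ in range(generations):
        pool.sort(key=lambda c: score(c, counts))          # lowest score first
        best, partner = pool[-1], rng.choice(pool[1:-1])
        # The mutated offspring takes the place of the lowest scorer.
        pool[0] = mutate(mate(best, partner, rng), rng)
    return max(pool, key=lambda c: score(c, counts))

counts = pair_counts("the rain falls on the road and the river runs on")
print(evolve("eht nair", counts, rng=random.Random(0)))
```

Because only the lowest scorer is replaced each round, the best candidate found so far is never thrown away, so the top score can only stay the same or go up.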

the hardest part of either type of algorithm is determining when to stop.
for my radiohead one I just estimate the line count compared to an average of the songs with some random variation... not to mention I think that chunk is likely not as optimal as it should be so there is likely some room for error in there too, but it doesn't really affect the outcome of what you see.
 
smallmovesal said:
i see...

i noticed some grammatical errors in the usage of terms like to/too, etc... must've been from whoever posted the lyrics.

this has radiohead lyrics on it... don't know if it has anything you haven't used at this point - i just remembered i have a link to it.

hope it's useful.

http://www.patleck.com/lyrics/radiohead/radiohead.htm

as far as I know between followmearound and greenplastic, they have all the lyrics, so I'm not too worried about that.

as for grammar - it is unlikely that they typed it in wrong.
what you are seeing is part of the statistical anomalies.


for instance, it comes up with a section such as "I lik" and then it looks up the probabilities of the sections that could follow it.
it could see:
"e ch -> 32"
"e st -> 24"
etc etc

so it then does a weighted random on the numbers, so it picks a random one, but it is more likely to pick the one that has the higher number, but it doesn't *always* pick it. and then it takes that chain and looks at the next probability of whatever follows and on and on.
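
A small Python sketch of that weighted pick (not the original Perl), using the two counts from the example above. `weighted_pick` is a made-up name; `random.choices` from the standard library does the weighting.

```python
import random

def weighted_pick(counts, rng=random):
    """Pick one chunk at random, with higher counts more likely but never guaranteed."""
    chunks = list(counts)
    weights = [counts[c] for c in chunks]
    return rng.choices(chunks, weights=weights, k=1)[0]

counts = {"e ch": 32, "e st": 24}
print(weighted_pick(counts))
```

With these two counts, "e ch" should come up about 32 times out of every 56 picks and "e st" the other 24.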

so if something ended in " t" then there are a bunch of different words that could follow it. it could be "tigers" it could be "totally" it could be "to" or it could be "too"

it isn't inherently clued in about grammar - it just goes by statistics.

and, as that word list grows, a side effect is that the proper grammar will get stronger weighting and it will eventually correct such things.
but the errors here aren't because the data was wrong to start with.

that said, radiohead songs also have some random shit in there too.

also, these functions all remove the []()" chars b/c it doesn't know how to close them properly.
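
A quick sketch of that kind of cleanup in Python (the original was Perl; `strip_paired` is a made-up name). It simply deletes the five characters listed above before the text goes into the matrix.

```python
# Translation table that deletes [ ] ( ) and the double quote.
STRIP_TABLE = str.maketrans("", "", '[]()"')

def strip_paired(text):
    """Remove the paired characters the generator can't close properly."""
    return text.translate(STRIP_TABLE)

print(strip_paired('say "hi" (now) [ok]'))
```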
 
i figured it had to be the program because radiohead has the lyrics in their albums (from what i can recall) because i got what you were saying about it being statistical.
 
smallmovesal said:
can i see an example of what the coding looks like for your matrix?

what did you write it with - perl?

yes, written in perl - mainly b/c it is so easy to write quickly in.
the code doesn't run as fast as code in C, but it takes far less time to write and test.

as for an example of the code - which part?
I can write out the code in "english" (rough pseudo code I guess) and it will likely be easier to understand than the actual code.

it is vaguely like this:
1) read in the data and put it in a string
2) split the string on '' so that each char is one spot in the array
3) iterate over the array and look at 4 positions at a time (x, x+1, x+2, x+3) and then look at the next 4 positions (x+4, x+5, x+6, x+7), concat each set into its own string
4) set up a hash so that the first string is the key, and it points to a sub hash, in which the second string is the key and you increment its count by 1 - that way if this pair shows up again, its count goes up
5) when this is done, you have your matrix
6) then to write out a song, randomly pick a starting key from the main hash and make sure that it starts with a capital letter
7) then take that 4 char value and look it up in the hash, pick a randomly weighted value for it, and so on and so forth.

you would loop number 7 until you find a point at which you want to stop.
like I said, I count the line breaks ("\n") and once they hit an average point, then I kick out of the loop.
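
The steps above can be sketched roughly like this in Python. This is not the original Perl, just an approximation under some assumptions: the window slides one character at a time, the corpus is a made-up placeholder, and `average_lines` and the +/- 2 variation are hypothetical numbers.

```python
import random
from collections import Counter, defaultdict

def build_matrix(text, n=4):
    """Steps 1-5: map each n-char chunk to counts of the n-char chunk that follows it."""
    matrix = defaultdict(Counter)
    for i in range(len(text) - 2 * n + 1):
        matrix[text[i:i + n]][text[i + n:i + 2 * n]] += 1
    return matrix

def write_song(matrix, target_lines, rng):
    """Steps 6-7: start on a capitalized chunk, then chain weighted random picks."""
    # Fall back to any key if no chunk starts with a capital letter.
    starts = [k for k in matrix if k[0].isupper()] or list(matrix)
    current = rng.choice(starts)
    pieces, lines = [current], current.count("\n")
    while lines < target_lines:          # stop once enough line breaks are out
        followers = matrix.get(current)
        if not followers:                # dead end: this chunk was never a key
            break
        chunks = list(followers)
        current = rng.choices(chunks, weights=[followers[c] for c in chunks])[0]
        pieces.append(current)
        lines += current.count("\n")
    return "".join(pieces)

# Placeholder corpus, not real lyrics.
corpus = "The rain falls down\nThe river runs on\nThe road runs down\n" * 3
rng = random.Random()
average_lines = 6                              # hypothetical average song length
target = average_lines + rng.randint(-2, 2)    # hypothetical random variation
print(write_song(build_matrix(corpus), target, rng))
```

With a corpus this small the output mostly replays the source lines, which matches the point made earlier that more text means less exact repetition.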
 