Recently, I've been working with a dataset that contains the public statements of congresspeople from the past 10 years.
The data was scraped by a colleague from Vote Smart's public statements collection.
We have a longer-term project running that involves this dataset, but this past weekend I wondered what kind of text an n-gram model would generate from the public statements (hilarious text, no doubt).
Then, I realized that I'd never actually implemented an n-gram language model that could generate text before.
I set to work, and a little while later (plus some tweaks over the next week or so), I made this.
The Markov model code is general enough to let the user choose the length of the n-grams, which is nice. It also doesn't assume its input is words: it works on any sequence of tokens.
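To make the idea concrete, here is a minimal sketch of an n-gram Markov text generator. It is not the code from this post: the function names `build_model` and `generate`, the `seed` parameter, and the restart-on-dead-end behavior are all assumptions made for illustration. It only shows the two properties described above, a configurable n-gram length and input that can be any token sequence.

```python
import random
from collections import defaultdict


def build_model(tokens, n=3):
    """Map each (n-1)-token context to the list of tokens that followed it.

    Hypothetical helper for illustration; `tokens` can be any sequence of
    hashable items (words, characters, ...).
    """
    if n < 2:
        raise ValueError("n must be at least 2")
    model = defaultdict(list)
    for i in range(len(tokens) - n + 1):
        context = tuple(tokens[i:i + n - 1])
        # Repeated followers stay in the list, so sampling from it later
        # is proportional to how often each follower was observed.
        model[context].append(tokens[i + n - 1])
    return dict(model)


def generate(model, length=50, seed=None):
    """Walk the chain and return a list of `length` generated tokens."""
    if not model:
        raise ValueError("model is empty; the input was shorter than n")
    rng = random.Random(seed)
    contexts = list(model)
    context = rng.choice(contexts)
    output = list(context)
    while len(output) < length:
        followers = model.get(context)
        if not followers:
            # Dead end (this context only appeared at the very end of the
            # input): restart from a random context. An assumed policy.
            context = rng.choice(contexts)
            output.extend(context)
            continue
        nxt = rng.choice(followers)
        output.append(nxt)
        context = context[1:] + (nxt,)
    return output[:length]


# Word-level bigram model on a toy sentence.
words = "the cat sat on the mat and the cat ran off".split()
print(" ".join(generate(build_model(words, n=2), length=12, seed=0)))

# The same code on characters instead of words.
chars = list("abracadabra abracadabra")
print("".join(generate(build_model(chars, n=3), length=20, seed=0)))
```

Larger values of `n` tend to produce text that reads more like the source but copies longer stretches of it verbatim; smaller values produce more scrambled output.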
An example from Marco Rubio:
significant interests in ensuring that no matter what your parents talk and you are advancing toward democracy and free enterprise wants to survive this is the most exceptional nation in all the government take that for granted those of the doubt i hope i can t forget the rhetoric basically goes is there an issue i am new to legislation i introduced immigration legislation we need a strong and stable alternative to the people in america that are happening as much they cooperate with this that egypt has a credit rating agencies say is at the idea of its citizens in
And another one, because this one had Wiz Khalifa in it and I can't let that go to waste:
institution and i was born with these threats by focusing solely within our borders and in the best in our interest is to promote access for low skilled labor long delays in the senate but let s put a stop to this point i would submit to you period the sooner we accept that but did you not do in the medicare advantage and you will find in legal circles particularly in the system is the story of how prevalent human trafficking and modern day poet his name is wiz khalifa called work hard so that we need to pass the
I'm interested in building technological platforms that leverage what we know about social dynamics to help people live their lives better.
I'm currently working at the Human Dynamics Group at the MIT Media Lab, creating systems that attempt to measure and impact human social and health behaviors.
I've also worked with the Lazer Lab, inferring partisan dynamics from congressional public statements.