A 20-Step Plan for My Baseball Century Experiment

Back in August, I wrote a post about my Baseball Century Experiment. I haven’t had much of a chance to do actual work on it, but in the months since, I’ve done a lot of reading, particularly on the science of sabermetrics, and I now have a plan, at least, for moving forward with this little experiment. I explained some of my reasons for doing this in my original post, and a few people pointed out that there was already software out there that does this, so why reinvent the wheel? I have a couple of thoughts on this:

The first and most important reason, for me, is to learn. Sure, there is software out there that can do this, but it exposes only the results. I’m interested in the internal mechanics of how such a piece of software might work. This helps me in three different ways:

  1. It allows me to make a deeper exploration of baseball by implementing it as a simulation myself.
  2. It allows me to dive deeper into a development package–in this case, Mathematica–that I want to know better.
  3. It allows me to tinker in ways that I could not do with off-the-shelf software.

Second, I’ve looked at the software that is out there. The top-of-the-line appears to be Out of the Park Baseball. Not only did I look at this, but I bought a copy for my Mac and played around with it a bit. It gets to some of what I am looking to do, but not all of it. I’m not (at the moment) interested in human management in the game. I’m currently more interested in simulating human management through some basic game AI. That is part of the fun for me.

Third, I’m not interested in developing the kind of elaborate interface that OOTP has. My simulation will be entirely text-based. My ideal output and presentation layer would be something akin to WolframAlpha, for baseball, where you could type in some natural language queries and get a boatload of results, charts, graphs, numbers, etc. But at the simplest level, I’m satisfied with producing text-based box scores, play-by-plays, rosters, lineups, standings, etc.

Fourth, I’m not interested in using real players. Part of the point is to think of this as almost an alternate history to baseball. Fictional players, randomly generated, moving through careers based on statistically valid simulations.

The ultimate goal of my initial[1] experiment is to be able to simulate 100 continuous seasons of baseball, and then look at the resulting numbers to see who the leaders are. Did anyone ever hit .400 in a season? Did anyone break a 56-game hitting streak? Who is the home run king, and what is the record? Did any pitcher throw a perfect game?

My approach to all of this is starting very simple and layering on more and more complexity. Over the last several weeks, I have drawn up a plan for how I will approach this. It looks something like this:

1. Develop a simple player generator

Since I’m not using real players, I need a way of bootstrapping players. One of the tools I will need to create, therefore, is a player generator. As with all the tools I’ll need to develop, my plan is to start simple and layer on complexity over time. The simple version of the tool will generate names, positions, and some basic stats for the players. My present approach for generating the stats will be to assume a standard bell curve for a statistic and randomize the stats based on a normal distribution. The probabilities of such a distribution would allow for an appropriate relative mix of “average” players, “superstars,” and players who don’t perform so well. Put another way, there would be a lot of values (say, batting averages) that have small deviations from the mean. There would be very few that are far better or far worse.

Not a perfect solution but it allows me to bootstrap some basic statistics in a fair way without the need to borrow from real player numbers.
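A minimal sketch of such a generator might look like the following. It is written in Python rather than Mathematica purely for illustration, and the name lists, the .260 mean, the .025 standard deviation, and the clamping bounds are all placeholder assumptions, not values taken from this plan.

```python
import random

# Placeholder data: these names and positions are assumptions for illustration.
FIRST_NAMES = ["Alan", "Ben", "Carl", "Dave", "Ed", "Frank"]
LAST_NAMES = ["Ashby", "Brill", "Cobden", "Dunmore", "Eller", "Frost"]
POSITIONS = ["P", "C", "1B", "2B", "3B", "SS", "LF", "CF", "RF"]


def generate_player(rng, mean_avg=0.260, sd_avg=0.025):
    """Create one fictional player with a normally distributed batting average."""
    avg = rng.gauss(mean_avg, sd_avg)
    # Clamp extreme draws so no one ends up with an absurd average (assumed bounds).
    avg = min(max(avg, 0.150), 0.400)
    return {
        "name": f"{rng.choice(FIRST_NAMES)} {rng.choice(LAST_NAMES)}",
        "position": rng.choice(POSITIONS),
        "avg": round(avg, 3),
    }


def generate_pool(n, seed=None):
    """Generate n players; passing a seed makes the pool reproducible."""
    rng = random.Random(seed)
    return [generate_player(rng) for _ in range(n)]
```

Calling `generate_pool(1000, seed=1)` would produce a pool in which most averages cluster near the mean and only a handful land far above or below it, which is the "few superstars, many average players" shape described above.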

2. Develop a simple team generator

The team generator in this instance is a way of picking out the players needed to create a roster of n people, with all of the necessary slots filled (so many pitchers, so many fielding positions, etc.) from the pool of available people. In a more complicated version, the team generator would be a kind of AI scout or GM, looking at what is available and getting the best that it could. But that is way down the line. Right now, I’m simply looking to be able to create teams out of the players generated in #1.
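As a rough sketch of the simple version (in Python, for illustration only), a roster could be filled by walking a table of required slots and taking the first available players at each position. The function name `build_roster`, the dictionary-based player records, and the slot table format are assumptions made for this example.

```python
def build_roster(pool, slots):
    """Fill every slot from the pool and remove the chosen players from it.

    pool  -- list of player dicts, each with a "position" key (mutated on success)
    slots -- dict mapping position -> number of players needed there
    """
    roster = []
    for position, needed in slots.items():
        candidates = [p for p in pool if p["position"] == position]
        if len(candidates) < needed:
            raise ValueError(f"not enough players available at {position}")
        # Simple version: take the first ones found, with no scouting logic.
        roster.extend(candidates[:needed])
    # Only touch the pool once every slot is known to be fillable.
    taken = {id(p) for p in roster}
    pool[:] = [p for p in pool if id(p) not in taken]
    return roster
```

Because the pool shrinks as each team is built, calling this once per team hands out every player at most once. A smarter "AI GM" version would replace the `candidates[:needed]` line with some ranking of the candidates.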

3. Develop a lineup generator

Again, we are talking simple here. In a more complex version, the lineup generator would be part of the manager AI function. For now, I’m looking to produce the best possible lineup with the data available. In its simplest terms, this is likely a two-part problem:

  1. Identify a team player for every position.
  2. For each position, sort the players by OBP (on-base percentage) and then choose the player with the best OBP for the given position.

At this point, the pitcher almost doesn’t matter.

In future versions, I’ll probably also look at some more advanced sabermetric statistics, but this is good enough for now.
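The two-part selection above can be sketched in a few lines of Python (again, only as an illustration). The `obp` and `position` keys and the `pick_lineup` name are assumptions for this example, and the sketch ignores the pitcher entirely, in keeping with the note above.

```python
FIELD_POSITIONS = ["C", "1B", "2B", "3B", "SS", "LF", "CF", "RF"]


def pick_lineup(roster, positions=FIELD_POSITIONS):
    """Return a dict mapping each position to the roster's best-OBP player there."""
    lineup = {}
    for position in positions:
        # Part 1: find every player on the team who plays this position.
        candidates = [p for p in roster if p["position"] == position]
        if not candidates:
            raise ValueError(f"no player available at {position}")
        # Part 2: sort by OBP, best first, and take the top player.
        ranked = sorted(candidates, key=lambda p: p["obp"], reverse=True)
        lineup[position] = ranked[0]
    return lineup
```

This only decides who starts at each position. It says nothing about batting order, which would be a separate decision for the manager AI.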

4. Simulate a match-up using simple BLOOP methodology

BLOOP is a method of simulation that sabermetricians have used quite a bit. It’s fairly simple. It involves calculating the probability of various outcomes based on the hitter’s stats. A more sophisticated version normalizes these calculations based on the pitcher they are facing and across the league, but for now, I’m keeping things simple.
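To make the idea of "outcome probabilities from the hitter’s stats" concrete, here is a small Python sketch. It is not a faithful implementation of BLOOP itself. It simply assumes a hitter is described by counting stats per plate appearance (the keys `PA`, `1B`, `2B`, `3B`, `HR`, and `BB` are my own choice for this example), treats everything else as an out, and ignores the pitcher.

```python
import random

OUTCOMES = ["1B", "2B", "3B", "HR", "BB"]


def outcome_probabilities(stats):
    """Turn a hitter's counting stats into per-plate-appearance probabilities."""
    pa = stats["PA"]
    probs = {outcome: stats[outcome] / pa for outcome in OUTCOMES}
    # Whatever is left over is treated as an out (assumption of this sketch).
    probs["OUT"] = 1.0 - sum(probs.values())
    return probs


def simulate_plate_appearance(stats, rng):
    """Randomly pick one outcome, weighted by the hitter's probabilities."""
    probs = outcome_probabilities(stats)
    return rng.choices(list(probs), weights=list(probs.values()))[0]
```

For a hitter with 25 home runs in 600 plate appearances, each simulated plate appearance has a 25/600 chance of being a home run, so over many trials the simulated totals should drift toward the hitter’s real rates.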

5. Simulate a baseball game using BLOOP matchups

Since I will be able to generate teams and lineups at this point, and do matchups, the first big milestone of my project is to simulate a game. This is where a lot of the rules of the game are programmed in and where I’ll first start implementing simple decision charts for the fielding AI. These charts are pretty simple: in a given situation, say 1 out, runner on 1st, ball hit to second, what do you do? Also, this isn’t a pitch-by-pitch simulation but an at-bat/outcome simulation. So the level of granularity is still pretty coarse. Again, I’m not looking to simulate reality quite yet, just get the fundamental framework in place.

One key piece of output from the simulated game will be the box score.
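A bare-bones version of the at-bat/outcome game loop could look like the Python sketch below. It makes several simplifying assumptions that are not part of the plan: every runner advances exactly as many bases as the batter, walks only force runners, there are no fielding decisions at all, the home team always bats in its half of the inning, and tied games simply continue into extra innings. It produces a line score (runs per inning), which is the skeleton a box score would be built on.

```python
BASES_FOR_HIT = {"1B": 1, "2B": 2, "3B": 3, "HR": 4}


def advance(bases, outcome):
    """Apply a non-out outcome to [first, second, third]; return (bases, runs)."""
    first, second, third = bases
    if outcome == "BB":
        # A walk only moves runners who are forced along.
        runs = 1 if (first and second and third) else 0
        return [True, second or first, third or (first and second)], runs
    n = BASES_FOR_HIT[outcome]
    runs = 0
    new_bases = [False, False, False]
    for i, occupied in enumerate(bases):
        if occupied:
            if i + n >= 3:
                runs += 1  # runner reaches home
            else:
                new_bases[i + n] = True
    if n == 4:
        runs += 1  # the batter scores on a home run
    else:
        new_bases[n - 1] = True
    return new_bases, runs


def half_inning(next_outcome):
    """Play until three outs; next_outcome() returns 'OUT', 'BB', or a hit code."""
    outs, runs, bases = 0, 0, [False, False, False]
    while outs < 3:
        outcome = next_outcome()
        if outcome == "OUT":
            outs += 1
        else:
            bases, scored = advance(bases, outcome)
            runs += scored
    return runs


def play_game(away_outcome, home_outcome, innings=9):
    """Return a line score; keeps playing extra innings while the game is tied."""
    line = {"away": [], "home": []}
    while len(line["away"]) < innings or sum(line["away"]) == sum(line["home"]):
        line["away"].append(half_inning(away_outcome))
        line["home"].append(half_inning(home_outcome))
    return line
```

The `next_outcome` callables are where the matchup logic plugs in: each call stands for one plate appearance by whichever batter is due up.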

6. Simulate a series of n games where stats are carried over from game to game

Once I can simulate one game, the next step would be to simulate a series of n games, where player stats evolve based on the performance in the previous game. At this point, I’m not looking at things like player fatigue, errors, injuries, etc. I’m testing to make sure I can carry forward the statistics and evolve them as they go along. I’m also looking at performance tweaks. I have no idea how much processor time it will take to simulate a game, let alone 50 games. Having a baseline will allow me to begin looking for performance improvements along the way.

Output from this series will include a box score for each game, player stats for the entire series, and a series “standing.”
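Carrying stats from game to game mostly comes down to adding each game’s counting stats to a running total and recomputing the rate stats from those totals. A small Python sketch of that bookkeeping follows. The `AB` and `H` keys, the function names, and the idea of keying players by name are assumptions for the example.

```python
from collections import Counter


def add_game(series_totals, game_stats):
    """Fold one game's per-player counting stats into the running series totals.

    series_totals -- dict of player name -> Counter of stats (updated in place)
    game_stats    -- dict of player name -> dict of stats from a single game
    """
    for player, line in game_stats.items():
        series_totals.setdefault(player, Counter()).update(line)


def batting_average(totals):
    """Hits divided by at-bats, or 0.0 for a player with no at-bats yet."""
    return totals["H"] / totals["AB"] if totals["AB"] else 0.0
```

After each simulated game, the totals are updated, and any stat that feeds the next game’s probabilities can be recomputed from them.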

7. Develop a schedule generator

The whole point, after all, is to be able to simulate a season with teams playing one another throughout the season. As with all of the other components, the first iteration will be simple. The schedule will be based on 2 leagues with 3 divisions per league and 5 teams per division for a total of 30 teams. The schedule will include 162 games per team. Future versions will allow for a parameterized schedule where the number of teams or games can vary.

Output from this will be a simple, text-based schedule. A further refinement will be a database version of the schedule. (See item 8.)
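One way to get to exactly 162 games per team with this 2 x 3 x 5 structure is to fix how many times each pair of teams meets. The Python sketch below assumes 18 games against each of the 4 division rivals, 6 against each of the 10 other teams in the same league, and 2 against each of the 15 teams in the other league (72 + 60 + 30 = 162). Those counts are my own choice for the arithmetic to work out, not something taken from the plan, and the sketch produces only the list of matchups, not dates.

```python
from itertools import combinations


def make_teams():
    """30 teams identified by (league, division, slot): 2 leagues x 3 divisions x 5."""
    return [(league, division, slot)
            for league in ("A", "B")
            for division in range(3)
            for slot in range(5)]


def games_between(team1, team2):
    """Number of meetings for a pair of teams (assumed counts, see note above)."""
    if team1[0] != team2[0]:
        return 2   # different leagues
    if team1[1] != team2[1]:
        return 6   # same league, different division
    return 18      # same division


def build_schedule(teams):
    """Return a list of (home, away) games, split evenly between the two parks."""
    games = []
    for team1, team2 in combinations(teams, 2):
        half = games_between(team1, team2) // 2
        games += [(team1, team2)] * half
        games += [(team2, team1)] * half
    return games
```

Because every pairing count is even, each team ends up with 81 home games and 81 away games. Turning this list into a dated calendar is a separate problem.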

8. Develop a statistics database

The primary output of a simulation is data. That data needs to be stored somewhere, and considering the ultimate granularity and volume I am aiming for, a relational database makes the most sense at present. Output from the database will be text-based reports (box scores) and charts and other diagrams that can be produced with Mathematica.
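As a sketch of what a first relational layout might look like, here is a tiny schema using SQLite through Python’s standard `sqlite3` module. The table names, the columns, and the choice of SQLite are all assumptions for illustration. The real design would need many more tables (pitching lines, teams, play-by-play events, and so on).

```python
import sqlite3

# Hypothetical minimal schema: one row per player, per game, and per batting line.
SCHEMA = """
CREATE TABLE player (id INTEGER PRIMARY KEY, name TEXT NOT NULL, position TEXT NOT NULL);
CREATE TABLE game (id INTEGER PRIMARY KEY, season INTEGER NOT NULL, home TEXT, away TEXT);
CREATE TABLE batting_line (
    game_id INTEGER REFERENCES game(id),
    player_id INTEGER REFERENCES player(id),
    ab INTEGER, h INTEGER, hr INTEGER,
    PRIMARY KEY (game_id, player_id)
);
"""


def open_db(path=":memory:"):
    """Open (or create) the stats database and set up the tables."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def season_batting(conn, season):
    """Season totals per player as (name, AB, H, HR), best batting average first."""
    return conn.execute(
        """
        SELECT p.name, SUM(b.ab), SUM(b.h), SUM(b.hr)
        FROM batting_line b
        JOIN player p ON p.id = b.player_id
        JOIN game g ON g.id = b.game_id
        WHERE g.season = ?
        GROUP BY p.id
        ORDER BY SUM(b.h) * 1.0 / SUM(b.ab) DESC
        """,
        (season,),
    ).fetchall()
```

Storing one row per player per game keeps the data at its finest grain, so season, career, and all-time leaderboards all become different `GROUP BY` queries over the same table.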

9. Develop a game engine

The game engine takes the schedule and plays through all of the games in the schedule, capturing all of the data and producing the necessary output.

10. Simulate a full season

The next big milestone is to be able to simulate a full season using all of the tools developed to this point. The output should be at multiple levels of aggregation, from the game-level, to division, and league. Overall bests and worsts should be included in the output.

11. Add pitch-by-pitch to the simulation

Up the level of granularity of a match-up by going to pitch-by-pitch and improving the overall probabilities for outcomes. In addition to capturing what pitch is thrown, the location of balls put into play should also be captured as part of this addition.

12. Develop a stats module

This is a mechanism by which baseball statistics can be pulled out of the data using a common baseball language. The module will allow comparisons across any statistic and any time frame. There will also be functions to identify trends in the data.

12. Add a “broadcaster” mode

Provides a text-based “log” that simulates a broadcaster calling the play-by-play of the game. For a given game, this adds an additional output, the “transcript” of the game.

13. Add “color commentary” mode

Whereas the broadcast mode is a simple calling of the game, the “color” commentary will be an attempt to provide game insights and interesting situational statistics during the run of the game. The output provides an additional level of detail to the transcript.

14. Add an All-Star game and post-season

The All-Star game will be based on the best stats at each position at the time of the game. The postseason will initially be set for 3 rounds: division, league, and championship. Initially there will be one wild card per league. Later on, these settings will be parameterized.

15. Simulate another full season

Another key milestone: simulating a full season with the improved model and richer data output.

16. Implement player career evolution

Players have to age. With age comes experience but also fatigue and performance issues. I want a way to implement a career evolution that will allow simulated players to move through their careers in a somewhat realistic way. There should be a marker at which point a player would be dropped or traded, as well as another when the players themselves would decide to retire.

17. Implement more realistic game effects

Account for field effects. Account for platooning. Better accounting for relief pitching. Better accounting for errors and fielding and base-running strategies. Account for player injuries. Fallible umpires.

18. Implement a more realistic schedule

Account for weather, called or postponed games. Possibly account for a travel fatigue factor.

19. Implement a commissioner function

Allows for continuity across the leagues over multiple seasons. Allows for some loose ability to make changes to the rules (widening the strike zone, designated hitter, etc.). These things can change at various points in the season.

20. Simulate n seasons’ worth of games

Yet another big milestone. In this case, we are starting from a bootstrapped set of players and, using everything developed thus far, simulating n seasons (10? 20?) of continuous play. Players come and go, teams win championships, and everything is recorded and searchable at the end of the simulation. Make use of the data to make additional refinements to the game play and AI portion of the simulation.

21. Introduce an economy to the game

Assign players “salaries” based on their sabermetric values. I’m not thinking of initially using dollars but some relative figure, like “average annual wage.” So a rookie ballplayer might start at 7aaw, which would mean 7 times the wage of an average worker in the U.S. This economy would evolve as seasons play out and numbers and values change. Introduce ticket sales and attendance changes based on game play.

22. Introduce a draft, trades, and free agency

This is so far down the road that I really haven’t considered the details.

Obviously, none of this happens overnight. And the higher the number of the item, the less I’ve considered the details. I’m eager, however, to get to step 20, where I can really begin to simulate multiple seasons and use the resulting data to improve the simulation going forward. I will report results as I have them. I’m still very much in the design and learning phase. I’m in the process of reading several books on sabermetrics as well as brushing up on my statistics.



  1. And I expect that over time, there will be more than one experiment.