"Paperback" Reutter: Sudoku data

I was given the 2012 and 2013 "Original Sudoku Calendars" as presents prior to the starts of those years, so naturally I recorded the time it took me to complete each puzzle and the date on which I completed it. A quick perusal of the data shows:

I'm not particularly good at Sudoku; I even started keeping track of when I screwed up and had to erase completely and start over.
There are some missing times, because I forgot to record them, because I was unable to finish the puzzle, or (in the case of a few at the beginning of 2012) because I hadn't started keeping the dataset yet.
2013 saw the introduction of "visual" sudoku, in which they used various symbols other than Arabic numerals. For a variety of reasons, I'm even worse at these than "regular" sudoku.
I tended to work on the puzzles in clumps, finishing several each evening for a few days (or during an airplane flight), and then not doing them for a week to a month (or more).

While working on the puzzles in 2012, I suspected that the "Medium" difficulty puzzles weren't any easier than the "Hard" difficulty puzzles; also, after spending so many hours doing sudoku puzzles, I wondered whether I'd gotten any better at solving them. Well, after some painstaking record keeping, we can answer that with data science! So, first a quick look at some descriptive statistics of the time it took to complete each puzzle by puzzle difficulty:

Well, hell. The Puzzle Difficulty is sorted in alpha order, so because I used text in the Excel spreadsheet instead of a numeric encoding with 1 = Very Easy, 2 = Easy, and so on, the table's out of order. So, after a little recoding:
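The recoding here was done on the Excel data; as a rough Python sketch of the same idea (not the code behind the table), the fix is to sort on a numeric code rather than on the text label. Only the four difficulty names mentioned in this post are used, and assuming they are the complete set is just that, an assumption.

```python
# Difficulty labels named in the post; whether this is the full set is an assumption.
DIFFICULTY_ORDER = {"Very Easy": 1, "Easy": 2, "Medium": 3, "Hard": 4}

labels = ["Medium", "Very Easy", "Hard", "Easy"]

# Plain text sorts alphabetically, which scrambles the difficulty scale.
alphabetical = sorted(labels)

# Sorting on the numeric code restores the intended order.
by_difficulty = sorted(labels, key=DIFFICULTY_ORDER.__getitem__)

print(alphabetical)
print(by_difficulty)
```

The same mapping can be applied to a whole column before building a table, so that the rows come out from easiest to hardest.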

Better! And I even remembered to save the image as a hard-G GIF, rather than a JPG, this time so there's no moiré pattern. What jumps out at me from this is that:

The mean time to complete for each category is about 4 minutes more than the previous category. That's eerie. Also, not great evidence to start out for my theory about Medium vs. Hard.
There are many more missing "Hard" puzzles.
The median values are all lower than the mean values, suggesting that the distribution of time to complete is right-skewed for each difficulty level. That's kind of a "duh" observation, because it was to be expected, but I should check to see just how skewed the distribution is.
The maximum value for the "Easy" puzzles is higher than the maximum for the "Medium" puzzles. This is, I think, due to the "Visual" puzzles.
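One quick way to put a number on "median lower than mean" is Pearson's second skewness coefficient, 3 * (mean - median) / standard deviation. The sketch below is only an illustration: the times are made up and are not values from this dataset, and the function name is hypothetical.

```python
from statistics import mean, median, stdev


def pearson_median_skewness(times):
    """Pearson's second skewness coefficient: 3 * (mean - median) / stdev."""
    return 3 * (mean(times) - median(times)) / stdev(times)


# Made-up completion times in minutes, NOT values from the dataset.
example_times = [8, 9, 10, 10, 11, 12, 30]

print(mean(example_times), median(example_times))
print(pearson_median_skewness(example_times))
```

A value near zero suggests a roughly symmetric distribution; a positive value means a long tail of slow puzzles is pulling the mean above the median.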

Breaking the information down further by year, and putting it in a graph so that we can better see the distribution of values, suggests that:

I shouldn't be worried about skewness when I go on to perform statistical tests; yes, there are some outlying values, especially on Easy, but I think these are all the "Visual" puzzles.
It looks like I may have gotten better at the Easy and Medium puzzles from 2012 to 2013, but was no better at the Hard puzzles.
My theory that I was no better at the Medium puzzles than the Hard puzzles (in 2012) is alive!

A quick look at boxplots, paneled by whether I messed up and whether the puzzle was "visual", confirms for me that I should go ahead and start building models.

So let's do a general linear model of the time to complete based on difficulty, whether the puzzle was visual, whether I messed up, and the year, with all first-level (two-way) interactions.
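To make the model specification concrete, here is a small Python sketch that lists the terms: four main effects plus every pairwise interaction. The factor names are the ones used in this post; the response name "Time" is an assumption, and the printed line is just a readable listing, not a formula for any particular statistics package.

```python
from itertools import combinations

# Factor names as they appear in the post; "Time" as the response name is an assumption.
factors = ["PuzzleDifficulty", "Visual", "Messedup", "year"]

# Every unordered pair of factors gives one two-way interaction term.
interactions = [f"{a}*{b}" for a, b in combinations(factors, 2)]

terms = factors + interactions
print("Time ~ " + " + ".join(terms))
```

Four factors give six two-way interactions, so the model has ten terms before the intercept.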

Lots of good stuff here.

The puzzle difficulty contributes the most to the model (duh, but good to have the confirmation).
Whether the puzzle was visual and whether I messed up and had to start over had roughly equal effects on the model.
The year did not have a significant effect on the model, which suggests that I did not, overall, get better from 2012 to 2013; however, the interaction of puzzle difficulty and year is significant, which means that the differences between 2012 and 2013 that we saw in the Easy and Medium puzzles may be real effects, and not just noise.
The interaction of PuzzleDifficulty and Visual is a redundant effect, because the visual puzzles were all of Easy difficulty. Likewise, the visual puzzles all appeared in 2013, so Visual*year is redundant.
PuzzleDifficulty*Messedup is statistically significant. This means that messing up on a Hard puzzle will have a different, and almost certainly greater, effect on the time to complete the puzzle than messing up on an Easy puzzle (duh, but good to have the reminder).
Visual*Messedup is also statistically significant. This means that messing up on a Visual puzzle (which we have seen take longer than regular Easy puzzles) will have a different, and almost certainly greater, effect on the time to complete than messing up on a regular Easy puzzle. This is really the same type of effect as PuzzleDifficulty*Messedup.
Messedup*year is not significant. Screwing up in 2013 was no better or worse than screwing up in 2012.
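For anyone unsure what a significant interaction such as PuzzleDifficulty*Messedup means in practice, it helps to look at a 2x2 table of cell means and take a difference of differences. The numbers in this sketch are hypothetical, chosen only for illustration, and the function name is made up.

```python
def messing_up_penalty(cell_means, difficulty):
    """Extra minutes when a puzzle of this difficulty had to be restarted."""
    return cell_means[(difficulty, True)] - cell_means[(difficulty, False)]


# Hypothetical mean times in minutes, keyed by (difficulty, messed up?).
# These are NOT values from the dataset.
cell_means = {
    ("Easy", False): 10.0,
    ("Easy", True): 18.0,
    ("Hard", False): 20.0,
    ("Hard", True): 42.0,
}

easy_penalty = messing_up_penalty(cell_means, "Easy")
hard_penalty = messing_up_penalty(cell_means, "Hard")

# A nonzero difference of differences is what an interaction looks like in the cell means.
interaction = hard_penalty - easy_penalty
print(easy_penalty, hard_penalty, interaction)
```

If restarting a puzzle scales the time by a roughly constant factor, the penalty in minutes grows with the baseline time, and an additive model will report that as an interaction.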

We could, at this point, look at the parameter estimates table, but it's big and messy and won't tell us anything important that we can't get from the tests of effects table above and the estimated marginal means table below.

Messing up roughly doubles the amount of time spent on a puzzle. This makes sense; often, it wasn't until I'd neared the end that I'd realize that two 7's were in the same row, column, or square (or the like).
Visual puzzles took about twice as long to finish as regular Easy puzzles, and messing up a visual puzzle compounded the error.
I did, in fact, get better at Easy and Medium puzzles from 2012 to 2013.
(*) I appear to have not gotten better at Hard puzzles from 2012 to 2013, and I appear to have been no better at Medium puzzles than Hard puzzles in 2012.

I'm putting an asterisk next to this last conclusion for the important reason that most of the missing Hard puzzles are from 2012. It's extremely likely that these are puzzles that I failed to solve at the time, and would have taken me a long time to solve if I'd kept at them. I would hazard a guess that I did, in fact, improve on Hard puzzles from 2012 to 2013, and that I took less time to finish the Medium puzzles than Hard puzzles in 2012; my belief that the Hard puzzles in 2012 were no more difficult than the Medium puzzles was predicated upon a lack of completion times for the hardest of the Hard puzzles.

Now, I could make certain assumptions about how long it would have taken me to complete the missing puzzles and re-run the model to do some "What if?" analysis, but I want to stop here for now. There are also some other methods for modeling whether I got better at solving these puzzles over time; maybe I'll look at them later.
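As a rough idea of what such a "What if?" pass could look like, the Python sketch below fills in each missing puzzle with an assumed completion time and recomputes the mean under a few scenarios. Everything numeric here is a placeholder rather than data from the calendars, and the function name is hypothetical; a fuller version would re-run the whole model instead of a single mean.

```python
from statistics import mean


def mean_with_assumed_times(recorded, n_missing, assumed_minutes):
    """Mean completion time if each missing puzzle had taken assumed_minutes."""
    return mean(list(recorded) + [assumed_minutes] * n_missing)


# Placeholder numbers for illustration only, NOT values from the dataset.
recorded_hard_times = [18.0, 22.0, 25.0, 27.0]
n_missing_hard = 2

# Try a few assumptions about how long the unfinished puzzles would have taken.
for assumed in (30.0, 45.0, 60.0):
    print(assumed, mean_with_assumed_times(recorded_hard_times, n_missing_hard, assumed))
```

The more pessimistic the assumed time, the further the mean for that group moves, which shows how sensitive a comparison like Medium versus Hard is to the puzzles that never got a time.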

I should also note that I didn't receive the 2014 calendar, and so will continue the dataset using puzzles from http://www.websudoku.com/ (though I haven't attempted any this year, so the future of this dataset is actually not very clear). The biggest benefit to using puzzles from the websudoku.com site is that I can link to the exact puzzle, so that my own dataset could be merged with other datasets kept by other people recording their times.

Monday, March 24, 2014
