The Moneyball guide to FPL

Seven years’ worth of player statistics, a database, optimising code in Python, countless podcasts, checking pre-season games, scrutinising the fixture list. These are just some of the things I’ve done over the last two weeks as I go for Fantasy Premier League (FPL) glory. I’m all in and I’m sharing my strategy.

If you aren’t familiar with football, Moneyball or FPL, you may have somehow stumbled across the wrong blog post. But if you are curious, or you can’t get enough of these things, read on. For those who don’t know me or Brazilfooty: I have written about Brazilian football for years, but now write about other things like FPL from time to time. And I’m an economist, which basically makes me the Brazilian football blogging equivalent of Brad Pitt’s sidekick in Moneyball.

Achieving FPL glory, a key life objective, is hard. There are more than 10 million players. And, it turns out, there is a community of clever, hard-working bloggers, microbloggers, podcasters and other heroes who know their stats and are trying to figure this out. People sell FPL optimising models, and one group is so confident in its model that it has promised to give users their money back if they don’t win their mini league.

Basically, being a Brazilian Moneyball guru counts for nothing anymore. All of that said, I am still confident that my ‘fresh’ approach can yield something useful and I’m convinced that I will finish 1st out of more than 10 million people. Let’s get things straight. This isn’t a game: it is a constrained optimisation problem. Yay. In other words: you optimise something (points) and have a series of constraints (a budget and rules on how many players you can pick, and from which teams, etc) to do this. Maths and data people do this sort of stuff all the time, which means it’s possible to code, which means it’s possible to get ChatGPT to do it. So I did it.

Since it has emerged that every Brazilian Tom, Dick and Harry (Fulano, Beltrano and Sicrano) has a super FPL model, the main thing I’m doing now is to decide the structure of my team. How much to spend on forwards, midfielders, defenders and goalkeepers? Let’s look at a few basics first. Over the years I’ve noticed that players tend to be more expensive the further up the pitch they play, even though they don’t necessarily score more points. That is true this year as the charts below show, suggesting that having more defenders in your team would be a good strategy. Perhaps a 5-3-2 lineup? Or spending more of your budget on good defenders?

Nice idea, but there are problems with those charts. First, they include all players, even those that scored no points last year (because they played in another league, were injured or whatever). Second, they reflect this year’s cost and last year’s points. Last year’s points are a decent guide, but they don’t necessarily reflect what will happen this year. To partially get around the first problem, I made the same charts using only those players that played more than 50% of all available minutes last year. Result: same story for average price, but a different story for average points. Similar story for average points per £ too, but with one big difference: you get more points per £ for goalkeepers than you do for other positions. Buy the most expensive keeper?
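For anyone who wants to reproduce this kind of chart, here is a minimal sketch of the filter and the averages. The rows are made up for illustration, and two things are assumptions rather than details from my own code: available minutes are taken as 38 matches of 90 minutes, and ‘average points per £’ is the mean of each player’s own points-per-£m ratio.

```python
from collections import defaultdict

# Made-up rows for illustration only:
# (name, position, price in £m, points last season, minutes last season)
PLAYERS = [
    ("GK_A", "GK", 5.0, 150, 3420),
    ("GK_B", "GK", 4.0, 0, 0),
    ("DEF_A", "DEF", 6.5, 160, 3000),
    ("DEF_B", "DEF", 4.5, 40, 900),
    ("MID_A", "MID", 12.5, 240, 3100),
    ("MID_B", "MID", 5.5, 30, 600),
    ("FWD_A", "FWD", 14.0, 270, 2800),
    ("FWD_B", "FWD", 6.0, 100, 2000),
]

AVAILABLE_MINUTES = 38 * 90  # assumption: 38 league matches of 90 minutes


def summarise(players, min_share=None):
    """Average price, points and points per £m for each position.

    If min_share is given, keep only players who played MORE than that
    share of the available minutes (0.5 means more than 50%).
    """
    groups = defaultdict(list)
    for name, pos, price, points, minutes in players:
        if min_share is not None and minutes <= min_share * AVAILABLE_MINUTES:
            continue
        groups[pos].append((price, points))

    summary = {}
    for pos, rows in groups.items():
        n = len(rows)
        summary[pos] = {
            "players": n,
            "avg_price": sum(price for price, _ in rows) / n,
            "avg_points": sum(points for _, points in rows) / n,
            # Assumption: mean of each player's own points-per-£m ratio.
            "avg_points_per_pound": sum(pts / price for price, pts in rows) / n,
        }
    return summary


everyone = summarise(PLAYERS)
regulars = summarise(PLAYERS, min_share=0.5)
for pos in ("GK", "DEF", "MID", "FWD"):
    print(pos, everyone[pos]["avg_points"], regulars[pos]["avg_points"])
```

Averaging the ratios and dividing average points by average price can give different answers, so it is worth deciding which one you mean before comparing positions.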

Let’s think about this: out of the players that generally play, on average, you get more points the further forward you go on the pitch, but it costs you more, and you get better value (in terms of points per £) further back. I think the takeaway then is: spend more on premium forwards and midfielders that are likely to get you big points and try to find the best value defenders and goalkeeper(s) to pay for them. The chart below – which shows that the average points per 90 minutes of those players that played 50%+ minutes last year is highest for forwards, followed by midfielders, goalkeepers and then defenders – supports this too.

I’m happy enough with this as an initial takeaway. But there are far fewer forwards and goalkeepers in the ‘played more than 50% of the minutes last year’ group and I have a suspicion that this could be distorting the stats. If, for some reason, only the really good defenders and goalkeepers are being included, then this might be unfairly skewing the statistics. And realistically, not every player that plays >50% of minutes would be considered a suitable candidate for inclusion in the game (defensive midfielders, and defenders who play for teams that concede a lot of goals, for example). A better analysis would include only the better players, so I narrow the pool to include players that I am more likely to pick.

I do this in two ways for each position: 1/ the 10 most expensive players; and 2/ the players in the top 25% based on price. See table below. It turns out that, even when looking at these groups, the key points are pretty much the same. But the fact that this is true when you look at the premium players in each position is important. It means you don’t need to look to your cheap defenders and keepers for good value: you can still get better value per pound spent on a premium goalkeeper or defender than you do on a premium forward.
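The two ways of narrowing the pool can be sketched as below. The defenders are made up, prices are held in tenths of £m, and rounding the 25% cut-off up is my assumption here (the post does not say how ties or fractions were handled).

```python
import math
from collections import defaultdict

# Made-up defenders: (name, position, price in tenths of £m, points)
PLAYERS = [
    ("DEF_A", "DEF", 75, 180),
    ("DEF_B", "DEF", 65, 160),
    ("DEF_C", "DEF", 55, 140),
    ("DEF_D", "DEF", 50, 120),
    ("DEF_E", "DEF", 45, 100),
    ("DEF_F", "DEF", 45, 90),
    ("DEF_G", "DEF", 40, 60),
    ("DEF_H", "DEF", 40, 30),
]


def premium_pools(players, top_n=10, top_percent=25):
    """For each position, return the top_n most expensive players and
    the most expensive top_percent of players."""
    by_pos = defaultdict(list)
    for row in players:
        by_pos[row[1]].append(row)

    pools = {}
    for pos, rows in by_pos.items():
        ranked = sorted(rows, key=lambda r: r[2], reverse=True)
        # Assumption: round the cut-off up, so small groups keep someone.
        share_size = math.ceil(len(ranked) * top_percent / 100)
        pools[pos] = {"top_n": ranked[:top_n], "top_share": ranked[:share_size]}
    return pools


def points_per_million(rows):
    """Mean points per £m across the given players."""
    return sum(points / (price / 10) for _, _, price, points in rows) / len(rows)


pools = premium_pools(PLAYERS, top_n=3)
print([r[0] for r in pools["DEF"]["top_n"]])
print([r[0] for r in pools["DEF"]["top_share"]])
print(points_per_million(pools["DEF"]["top_share"]))
```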

How to optimise this? You might want to just load up on defenders because you get better value, but it’s not just about value: you need to maximise points. Too much value and you don’t spend your whole budget. How many big names can you afford? And what sort of budget do you get, especially if the premium defenders offer good value, not just the cheap ones? And there are those other constraints too: you need to pick two keepers, five defenders, five midfielders and three forwards in your squad, but only one keeper, at least one forward, three midfielders and three defenders in the starting eleven. Not a problem for Brazilian Moneyball guy. I explained the rules to ChatGPT, told it what I wanted to do and asked it to write me some Python code to solve this problem. And after several iterations, corrections, shouting, tears and high fives it gave me the team below.
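To give a feel for the problem, here is a minimal brute-force sketch. It is not the code ChatGPT wrote for me: it tries every legal 15-man squad from a tiny made-up pool and scores each one by its best starting eleven, using the squad shape and starting-XI minimums described above. Prices are in tenths of £m, the club limit is left out, and brute force only works because the pool is so small; a real player list would need a proper solver.

```python
from itertools import combinations, product

# Made-up pool: position -> list of (name, price in tenths of £m, points)
POOL = {
    "GK": [("GK_A", 55, 150), ("GK_B", 45, 130), ("GK_C", 40, 20)],
    "DEF": [("DEF_A", 75, 180), ("DEF_B", 60, 150), ("DEF_C", 50, 130),
            ("DEF_D", 45, 110), ("DEF_E", 40, 60), ("DEF_F", 40, 30)],
    "MID": [("MID_A", 125, 240), ("MID_B", 85, 200), ("MID_C", 70, 180),
            ("MID_D", 60, 150), ("MID_E", 50, 100), ("MID_F", 45, 40)],
    "FWD": [("FWD_A", 140, 270), ("FWD_B", 125, 260), ("FWD_C", 60, 110),
            ("FWD_D", 45, 30)],
}

SQUAD_SHAPE = {"GK": 2, "DEF": 5, "MID": 5, "FWD": 3}


def xi_points(squad):
    """Best starting-XI score for a squad: try every allowed formation
    (one keeper, 3-5 defenders, 3-5 midfielders, 1-3 forwards)."""
    ranked = {pos: sorted((p[2] for p in rows), reverse=True)
              for pos, rows in squad.items()}
    best = 0
    for d, m, f in product(range(3, 6), range(3, 6), range(1, 4)):
        if d + m + f != 10:
            continue
        total = (ranked["GK"][0] + sum(ranked["DEF"][:d])
                 + sum(ranked["MID"][:m]) + sum(ranked["FWD"][:f]))
        best = max(best, total)
    return best


def best_squad(pool, budget):
    """Return (xi score, squad, cost) for the best squad within budget,
    or None if no legal squad is affordable."""
    options = [combinations(pool[pos], n) for pos, n in SQUAD_SHAPE.items()]
    best = None
    for groups in product(*options):
        cost = sum(p[1] for group in groups for p in group)
        if cost > budget:
            continue
        squad = dict(zip(SQUAD_SHAPE, groups))
        score = xi_points(squad)
        if best is None or score > best[0]:
            best = (score, squad, cost)
    return best


result = best_squad(POOL, 1000)  # £100.0m
print(result[0], result[2] / 10)
```

Because only the starting eleven scores, the search naturally pushes the cheapest players onto the bench, which is the same trade-off the rest of this post keeps coming back to.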

That’s 2340 points. Pretty good and better than my score last year (2240), my score the year before (2307) and the 2022/23 average (1927). It is a lot lower than the winning score of 2776 though. There are obvious reasons for this: it doesn’t include any subs, it doesn’t consider last year’s prices (which for the likes of Martinelli and co were much lower), bench boost, etc. Interesting, but don’t get hung up on this score. The useful information here is how it selects the team: the formation it goes for and the price range of the players it picks. Takeaways below.

Kane and Haaland. No Salah. No Trent. No Rashford or Saka. Three mid-price midfielders. One super premium defender (Trippier) and two more pricier-than-average defenders (£5.5 and £5.0) and an expensive keeper (£5.0). No bargain playing defenders. Two cheap reserve defenders and a cheap reserve forward. No reserve midfielders. No cheap defenders in the first XI. 3-5-2.

The names are interesting too, but keep in mind that these will change over the course of a season as you make transfers to deal with favourable/unfavourable fixture runs, suspensions and injuries. Also, really you ought to have at least one decent substitute on your bench, who would cost more than the benchwarmers in GPT’s team. And last year’s points are only an indication of this year’s points: new players like Nkunku had no points last year; the points of players that have left (including relegated teams) were not included; the players from the promoted teams have no points; Chelsea might improve and their players score more points, etc.

I’m pretty happy with the lessons learned so far. But I think there is room for improvement. So I looked for some more data with the help of my new best friend ChatGPT. And I managed to find a dataset with seven years’ worth of data, including players that no longer play in the league, with the points and prices of players in relegated teams and ending prices instead of starting prices.

Time to model again, using the new dataset. I ask it to give me the optimal squad (using a £100 budget) with no restrictions on positions and teams, and no distinction between starters and benchwarmers, meaning that this is the best squad, not necessarily the best team. This is what it gave me (where you see a name more than once, that is the same player but in a different season). A trip down FPL memory lane.

No Haaland. No super expensive player. Several expensive but not premium midfielders (whose prices would have gone up during the season). Basically, this is telling you to select players in their breakout seasons (easier in hindsight of course). A key thing to note here is that it loads you up with defenders and goalkeepers. Squad formation is 3-6-5-1 (I didn’t give this simulation any restrictions on positions or tell it about the captain).

This confirms a few of the previous discoveries: 1/ get premium defenders; 2/ don’t go for 4.5 defenders; 3/ go for mid-price midfielders with the potential to break out like Rashford, Odegaard and Martinelli in previous seasons (Eze or Mitoma this year?). But it gives some more information too: you have space for one premium midfielder or forward, but only one; 5.0+ keepers are okay, but a sub 5.0 keeper is okay too, perhaps.

The squad above is useful, but not realistic since you wouldn’t pay so much to have four scoring players on your bench, and the squad doesn’t reflect team or squad selection rules. Moreover, it uses the best performances over the last seven years, which are unlikely to all happen again this year. So, I tweak the model and ask it to drop two benchwarmers, who I don’t expect to play and will therefore pay minimum price for. I do this by reducing the overall budget to £91.5 (I assume a £4.0 and £4.5 benchwarmer) and ask it to maximise the score for a squad of 13 players. Still no constraints on positions, but the formation changes to 2-4-5-2. A keeper gets dropped (interestingly Fabianski also goes in favour of Raya who costs £0.1 more). Two defenders dropped. Premium forward added.
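With no position rules, ‘pick exactly N players within a budget to maximise points’ is a knapsack problem. Here is a minimal sketch using dynamic programming. It is one possible technique, not necessarily what my ChatGPT-written code does; the players are made up and prices are in tenths of £m so the budget can be an integer.

```python
def best_k_players(players, k, budget):
    """Pick exactly k players with total price <= budget to maximise points.

    players: list of (name, price in tenths of £m, points).
    Returns (total points, tuple of names), or None if nothing fits.
    """
    # dp[j][b] holds the best (points, names) using exactly j players
    # whose combined price is at most b; None means not achievable.
    dp = [[None] * (budget + 1) for _ in range(k + 1)]
    dp[0] = [(0, ())] * (budget + 1)

    for name, price, points in players:
        # Go from k down to 1 so each player is used at most once:
        # row j-1 has not yet been updated with the current player.
        for j in range(k, 0, -1):
            for b in range(price, budget + 1):
                prev = dp[j - 1][b - price]
                if prev is None:
                    continue
                candidate = (prev[0] + points, prev[1] + (name,))
                if dp[j][b] is None or candidate[0] > dp[j][b][0]:
                    dp[j][b] = candidate
    return dp[k][budget]


# Made-up players for illustration.
TOY = [("A", 50, 100), ("B", 60, 130), ("C", 70, 135), ("D", 40, 60)]

print(best_k_players(TOY, k=2, budget=110))
print(best_k_players(TOY, k=2, budget=105))
```

For the run described above, the call would use k=13 and a budget of 915 (£91.5m) on the full seven-season player list.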

Perhaps I can get away with two players that cost £4.0 on my bench (a £4.0 defender and a £4.0 keeper)? So I set the budget to £92.0 and the optimiser replaces £7.2 midfielder Rashford (205) with £7.7 defender Alexander Arnold (210) to give a 2-5-4-2 formation.

Maybe you only need one decent substitute. Set budget to £87.5 (assumes non-playing reserves cost £4.0, £4.0 and £4.5) and tell it to pick 12 players. Formation changes to 1-5-6-0. Premium Son replaces premium Kane. Budget Bamford is out. Premium defenders, mid-priced breakout midfielders confirmed. £5.0 million-ish keeper confirmed too.

How about a team of 11 players and ignore the bench? I do it but assume the non-playing subs cost £18.5 combined: £4.0, £4.0, £4.5 and £6.0. £6.0 is costly, but this is the lowest price of a decent playing and probably point-scoring forward like Wissa or DCL. Budget £81.5. Select me 11 players. Son goes. Alexander Arnold goes twice (cheap AA and expensive AA). Kane comes back in.

Maybe £6.0 for a non-playing player is too much (Joao Pedro costs £5.5m after all). Adjust budget to £82.0. Trent AA comes back in. Proposed formation is 1-4-5-1, which is almost a realistic one, even though I haven’t added any position constraints. No budget players (£4.5s or £5.0s), even in goal. Premium defenders all the way. A premium midfielder and a premium forward. Fill the rest with breakout mid-price midfielders. A picture is starting to emerge.
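The budgets in the last few runs are just £100.0m minus whatever I assume the non-playing reserves cost. A small sketch of that arithmetic, using the bench prices quoted above (the function name and scenario labels are mine, for illustration):

```python
TOTAL_BUDGET_TENTHS = 1000  # £100.0m, held in tenths to avoid rounding noise


def playing_budget(bench_prices):
    """Budget in £m left for the scoring players once the assumed
    bench prices (in £m) have been set aside."""
    bench_tenths = sum(round(price * 10) for price in bench_prices)
    return (TOTAL_BUDGET_TENTHS - bench_tenths) / 10


SCENARIOS = {
    "13 players": [4.0, 4.5],
    "13 players, two at 4.0": [4.0, 4.0],
    "12 players": [4.0, 4.0, 4.5],
    "11 players, 6.0 first sub": [4.0, 4.0, 4.5, 6.0],
    "11 players, 5.5 first sub": [4.0, 4.0, 4.5, 5.5],
}

for label, bench in SCENARIOS.items():
    print(label, playing_budget(bench))
```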

Still reading? I’m not done. 😊 The next step is to use the code on another group of players. The simulations so far are run on all players over the last seven years, including the best players, at their best price. This isn’t a representative sample of what is available to us this year. And even if it was, it is unlikely that anybody could pick, beforehand, all the best players. So, I reduce the sample by removing the best performing 20% of all players in each position (on the points per £ metric).
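Trimming the sample can be sketched like this. The rows are made up, prices are in tenths of £m, and rounding the number of dropped players down is an assumption on my part.

```python
from collections import defaultdict

# Made-up rows: (name, position, price in tenths of £m, points)
PLAYERS = [
    ("M1", "MID", 75, 210), ("M2", "MID", 125, 240), ("M3", "MID", 60, 150),
    ("M4", "MID", 55, 110), ("M5", "MID", 50, 90),
    ("G1", "GK", 50, 160), ("G2", "GK", 45, 130), ("G3", "GK", 55, 150),
    ("G4", "GK", 40, 100), ("G5", "GK", 45, 90),
]


def drop_top_value(players, drop_percent=20):
    """Within each position, remove the best drop_percent of players
    ranked by points per £ (the count is rounded down)."""
    by_pos = defaultdict(list)
    for row in players:
        by_pos[row[1]].append(row)

    kept = []
    for rows in by_pos.values():
        ranked = sorted(rows, key=lambda r: r[3] / r[2], reverse=True)
        n_drop = len(ranked) * drop_percent // 100
        kept.extend(ranked[n_drop:])
    return kept


trimmed = drop_top_value(PLAYERS)
print(sorted(row[0] for row in trimmed))
```

Note that this removes the best value players, not the most expensive ones: in the toy data the priciest midfielder stays in because his points per £ is ordinary.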

Doing this and running the code with no restrictions is wild. Budget £82.0. Select 11 players. It gives an 8-0-3-0 formation. What the heck!!!??? Conclusion: go big on keepers and select three premium midfielders. Obviously not possible. But this tells us that when there is uncertainty in the equation, and when the superstars are out of the equation, keepers offer the best value.

Keep going. At this point, we need to add more constraints. So I tell the model to include a minimum of one keeper, three defenders, three midfielders and one forward. That is our constraint. Budget still £82.0. And we get another mind blower: 1-3-3-5. Only one forward or midfield premium, but premium keeper and three premium defenders. Lots of noise but still one conclusion: one or two premium forwards/midfielders, premium defenders and a top keeper.
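Here is a minimal sketch of what adding those minimums does, again by brute force over a tiny made-up pool (prices in tenths of £m). It is an illustration of the constraint, not my actual model, and checking every combination is only feasible because the pool has 14 players.

```python
from collections import Counter
from itertools import combinations

# Made-up pool: (name, position, price in tenths of £m, points)
POOL = [
    ("G1", "GK", 55, 160), ("G2", "GK", 50, 150), ("G3", "GK", 45, 140),
    ("D1", "DEF", 75, 150), ("D2", "DEF", 60, 110), ("D3", "DEF", 50, 80),
    ("D4", "DEF", 45, 60),
    ("M1", "MID", 125, 240), ("M2", "MID", 80, 190), ("M3", "MID", 65, 160),
    ("M4", "MID", 55, 110),
    ("F1", "FWD", 140, 270), ("F2", "FWD", 80, 170), ("F3", "FWD", 60, 100),
]

MINIMUMS = {"GK": 1, "DEF": 3, "MID": 3, "FWD": 1}


def best_team(pool, size, budget, minimums=None):
    """Best-scoring team of `size` players within budget, optionally
    requiring a minimum number of players per position."""
    minimums = minimums or {}
    best = None
    for team in combinations(pool, size):
        if sum(p[2] for p in team) > budget:
            continue
        counts = Counter(p[1] for p in team)
        if any(counts[pos] < need for pos, need in minimums.items()):
            continue
        points = sum(p[3] for p in team)
        if best is None or points > best[0]:
            best = (points, team)
    return best


free = best_team(POOL, size=11, budget=820)
constrained = best_team(POOL, size=11, budget=820, minimums=MINIMUMS)
print(free[0], Counter(p[1] for p in free[1]))
print(constrained[0], Counter(p[1] for p in constrained[1]))
```

On this made-up pool the unrestricted pick keeps only two defenders and scores 1820, while the constrained pick scores 1800. A constraint can never raise the best score, only hold it or lower it, which is why the constrained runs should be read for their shape rather than their total.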

Using this group of data, I did a little more tinkering by changing the number of players (11 or 12) and using different bench budgets. A few interesting names and unrealistic formations pop up, but the main finding is that the existing takeaways still stand, with a slight nuance: you can select slightly cheaper in defence and midfield. That is probably because when the top quintile is excluded and the superstars in each position are gone (a situation closer to the choice facing you now), some of the lower-cost budget players become more attractive. Still no £4.5 defenders or £5.5 midfielders though.

Now it’s time to start experimenting with actual formations and proper boundaries. To do this, I needed to consider the lowest-cost player per position and the lowest cost per position of a player that is likely to play. Lowest costs: £4.0m GK; £4.0m DEF; £4.5m MID and £4.5m FWD. Lowest cost for players playing regularly: £4.0m for keepers and defenders, but (I reckon) midfielders and forwards that are likely to get reliable minutes cost £0.5m – £1.5m more than the lowest-cost players in those positions.

It turns out that having an extra £1.0m to spend on your benchwarmers (the difference between a bench with at least one player who gets minutes and another with someone who doesn’t get minutes) would only reduce the total points of the 11 players by 16 points over the whole season. Nobody wants wasted points on the bench, but I reckon you’ll want to sacrifice 16 points to have a better first substitute to cover injuries or suspensions. I keep this rule in mind as I do simulations for all the possible formations using the three main sets of data analysed so far: 2023-24 all players; players with >50% of minutes in last seven years; players with >50% of minutes in last seven years excluding top 20% of points per £ per position.
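Listing ‘all the possible formations’ is a small enumeration. A minimal sketch, assuming the starting-XI minimums described earlier in this post (three defenders, three midfielders, one forward) and the squad sizes as maximums; the function and its parameters are mine, for illustration:

```python
from itertools import product


def valid_formations(min_def=3, min_mid=3, min_fwd=1,
                     max_def=5, max_mid=5, max_fwd=3, outfield=10):
    """All (defenders, midfielders, forwards) splits of the ten outfield
    places that respect the per-position minimums and maximums."""
    return [
        (d, m, f)
        for d, m, f in product(range(min_def, max_def + 1),
                               range(min_mid, max_mid + 1),
                               range(min_fwd, max_fwd + 1))
        if d + m + f == outfield
    ]


for d, m, f in valid_formations():
    print(f"{d}-{m}-{f}")
```

With these defaults there are seven formations. Lowering the midfield minimum to two adds 5-2-3 as an eighth.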

The colour coding shows the best and worst formations for each of the datasets. There isn’t a massive difference, and you could read this in different ways. Go with whatever formation you prefer – you can make all of them work, providing you make the right choices and you have the right players. But my conclusion for my starting XI is that 3-5-2 is the best and I would steer clear of five at the back. Overall, the lessons don’t change a whole lot: go premium defenders; mid-range breakout midfielders; two premiums combined from your forward and midfield positions.

I’m at the stage now where I’ve done a lot, shared a lot and need to hit the publish button. To complete the analysis I really ought to analyse the distribution of scores by position, price and score to give an indication of ‘risk’. The optimisation I have done is based off numbers that happened – knowing the level of uncertainty associated with those outcomes before they happened is important and could yield different results. Something for another day.

The bottom line is that there isn’t any golden rule, but I do think the following pointers can help:

  • Don’t go five at the back
  • Three at the back and 3-5-2 works
  • One or two premium midfielders and forwards
  • Two, maybe three mid-priced midfielders
  • Expensive or premium defenders and goalkeepers

After all that has now been said and done, there are now only three things left for me to do. First, suggest you take a look at the chart below. Post your comments and thoughts at the bottom of this page, on the back of a postcard or into a bottle labelled Brazilfooty which you throw into the ocean.

Second, I challenge you to join the Friends of Brazilfooty mini league. Glory potentially awaits if you join and beat me after I spent all this time modelling with the help of my supercomputer friend. And last but not least, you must be wondering about my draft for Gameweek 1, so here it is. Health warning: I’ve broken some of my own rules and things may change before Friday. While I’ve considered the lessons learned described in this post, nothing beats checking the fixture list, planning subs for the next few weeks and going with general good vibes. Good luck in the game folks, and I hope you’ve enjoyed this post.
