This dataset provides a granular view of 208,446 events from 2,076 football matches played in France from the 2011/2012 season to the 2016/2017 season.

data_event

Format

A data.frame object with 208446 rows and 40 variables:

X

Integer vector, unique row identifier

id_odsp

Factor w/ 2076 levels, unique identifier of match

id_event

Factor w/ 208446 levels, unique identifier of event (id_odsp + sort_order)

sort_order

Integer vector, chronological sequence of events in a game

time

Integer vector, minute of the game

text

Factor w/ 79629 levels, text commentary

event_type

Integer vector, primary event. 11 unique events (1-Attempt(shot), 2-Corner, 3-Foul, 4-Yellow Card, 5-Second yellow card, 6-(Straight) red card, 7-Substitution, 8-Free kick won, 9-Offside, 10-Hand Ball, 11-Penalty conceded)

event_type2

Integer vector, secondary event. 4 unique events (12-Key Pass, 13-Failed through ball, 14-Sending off, 15-Own goal)

side

Integer vector, 1-Home, 2-Away

event_team

Factor w/ 30 levels, team that produced the event. In case of own goals, event_team is the team that benefited from the own goal

opponent

Factor w/ 30 levels, team that the event happened against

player

Factor w/ 1609 levels, name of the player involved in the main event (converted to lowercase, special characters removed)

player2

Factor w/ 1498 levels, name of the player involved in the secondary event

player_in

Factor w/ 1277 levels, player that came in (only applies to substitutions)

player_out

Factor w/ 1204 levels, player substituted (only applies to substitutions)

shot_place

Integer vector, placement of the shot (13 possible placement locations, available in the dictionary, only applies to shots)

shot_outcome

Integer vector, 4 possible outcomes (1-On target, 2-Off target, 3-Blocked, 4-Hit the post)

is_goal

Integer vector, binary variable indicating whether the shot resulted in a goal (own goals included)

location

Integer vector, location on the pitch where the event happened (19 possible locations, available in the dictionary)

bodypart

Integer vector, body part (1-Right foot, 2-Left foot, 3-Head)

assist_method

Integer vector, in case of an assisted shot, 5 possible assist methods (details in the dictionary)

situation

Integer vector, 4 types: 1-Open Play, 2-Set piece (excluding Direct Free kicks), 3-Corner, 4-Free kick

fast_break

Integer vector, binary variable indicating a fast break

link_odsp

Factor w/ 2076 levels, link to the oddsportal page

adv_stats

Logical vector, TRUE if the game has detailed event data

date

Factor w/ 592 levels, Date of game

league

Factor w/ 1 level, Club League

season

Integer vector, Year Played

country

Factor w/ 1 level, Host Nation of League

ht

Factor w/ 30 levels, home team

at

Factor w/ 30 levels, away team

fthg

Integer vector, full time home goals

ftag

Integer vector, full time away goals

odd_h

Numerical vector, highest home win market odds

odd_d

Numerical vector, highest draw market odds

odd_a

Numerical vector, highest away market odds

odd_over

Numerical vector, highest over 2.5 market odds

odd_under

Numerical vector, highest under 2.5 market odds

odd_bts

Numerical vector, highest both teams to score market odds

odd_bts_n

Numerical vector, highest both teams NOT to score market odds
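
Several of the integer variables above are codes whose labels are given inline in this list. As a minimal sketch of how they could be decoded, the Python snippet below maps those codes back to their labels. It assumes a row has already been read into a plain dictionary keyed by column name; the names CODEBOOK and decode_event and the sample row are hypothetical and not part of the dataset. The columns shot_place, location and assist_method are left out because their labels live in the separate dictionary, which is not reproduced here.

```python
# Labels copied from the variable descriptions in the Format list.
CODEBOOK = {
    "event_type": {
        1: "Attempt (shot)", 2: "Corner", 3: "Foul", 4: "Yellow card",
        5: "Second yellow card", 6: "(Straight) red card", 7: "Substitution",
        8: "Free kick won", 9: "Offside", 10: "Hand ball",
        11: "Penalty conceded",
    },
    "event_type2": {
        12: "Key pass", 13: "Failed through ball",
        14: "Sending off", 15: "Own goal",
    },
    "side": {1: "Home", 2: "Away"},
    "shot_outcome": {
        1: "On target", 2: "Off target", 3: "Blocked", 4: "Hit the post",
    },
    "bodypart": {1: "Right foot", 2: "Left foot", 3: "Head"},
    "situation": {
        1: "Open play", 2: "Set piece (excluding direct free kicks)",
        3: "Corner", 4: "Free kick",
    },
}


def decode_event(event):
    """Return a copy of `event` with coded columns replaced by labels.

    Missing values (None) and codes absent from CODEBOOK are left unchanged.
    """
    decoded = dict(event)
    for column, labels in CODEBOOK.items():
        code = decoded.get(column)
        if code in labels:
            decoded[column] = labels[code]
    return decoded


# Hypothetical row, invented for illustration only (not taken from the data).
sample = {
    "event_type": 1, "event_type2": 12, "side": 2,
    "shot_outcome": 1, "bodypart": 3, "situation": 3, "is_goal": 1,
}
decoded = decode_event(sample)
print(decoded["event_type"], "|", decoded["side"], "|", decoded["bodypart"])
```

Keeping the labels in one dictionary per column means a column that is missing for a given event (for example shot_outcome on a foul) passes through untouched rather than raising an error.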

Source

Kaggle.

Details

These data are a "cleaned" version of an original file, events_France.csv, which can be downloaded from the Kaggle platform: https://www.kaggle.com/secareanualin/football-events. However, some matches contain missing data (about 10