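This post charts how the number of deaths in Japan has changed over time, by age group. I had long wondered how many people die each year in each age group, so I took this as a good opportunity to find out.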

Preparing the data

First, download the required data from this site.

from pandas import *

# The CSV file is Shift_JIS-encoded
df = read_csv('dead.csv', encoding='shift_jis')
df.head()

	表章項目	死因年次推移分類	性別	年齢(5歳階級)	/時間軸(年次)	2017年	2016年	2015年	2014年	2010年	…	1995年	1990年	1985年	1980年	1975年	1970年	1965年	1960年	1955年	1950年
0	死亡数【人】	総数	総数	総数	NaN	1340397	1307748	1290444	1273004	1197012	…	922139	820305	752283	722801	702275	712962	700438	706599	693523	904876
1	死亡数【人】	総数	総数	0〜4歳	NaN	2454	2618	2692	2883	3382	…	7040	7983	10834	16298	25884	32879	42572	64692	99399	222903
2	死亡数【人】	総数	総数	5〜9歳	NaN	351	391	452	460	480	…	1235	1377	1791	2773	3205	3809	4533	8209	14240	19774
3	死亡数【人】	総数	総数	10〜14歳	NaN	437	440	470	501	553	…	1184	1242	1649	1627	2045	2625	3621	5545	6548	10212
4	死亡数【人】	総数	総数	15〜19歳	NaN	1161	1166	1220	1205	1422	…	3362	4353	4212	4043	4750	6926	7374	9829	10992	21222

5 rows × 22 columns

# convert_objects() is deprecated (and removed in pandas 1.0), so use to_numeric instead.
# Entries in the year columns that cannot be parsed as numbers become NaN.
year_cols = [c for c in df.columns if c.endswith('年')]
df[year_cols] = df[year_cols].apply(to_numeric, errors='coerce')

# Drop the descriptive columns that are not needed for plotting
df.drop(['表章項目','死因年次推移分類','性別','/時間軸(年次)'], axis=1, inplace=True)
df.head(3)

	年齢(5歳階級)	2017年	2016年	2015年	2014年	2010年	2005年	2000年	1995年	1990年	1985年	1980年	1975年	1970年	1965年	1960年	1955年	1950年
0	総数	1340397.0	1307748.0	1290444.0	1273004.0	1197012.0	1083796.0	961653.0	922139.0	820305.0	752283.0	722801.0	702275.0	712962.0	700438.0	706599.0	693523.0	904876.0
1	0〜4歳	2454.0	2618.0	2692.0	2883.0	3382.0	4102.0	5269.0	7040.0	7983.0	10834.0	16298.0	25884.0	32879.0	42572.0	64692.0	99399.0	222903.0
2	5〜9歳	351.0	391.0	452.0	460.0	480.0	655.0	738.0	1235.0	1377.0	1791.0	2773.0	3205.0	3809.0	4533.0	8209.0	14240.0	19774.0
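
As a small illustration of the numeric conversion used in this preparation step, here is a minimal sketch on a toy frame. The `'-'` entry is a hypothetical missing-value marker that I made up for the example; I have not checked what the real CSV uses.

```python
import pandas as pd

# Toy frame that mimics the layout above; values are stored as text on purpose
toy = pd.DataFrame({
    '年齢(5歳階級)': ['0〜4歳', '5〜9歳'],
    '2017年': ['2454', '351'],
    '1950年': ['222903', '-'],  # '-' is an assumed placeholder, not from the real data
})

# Convert only the year columns; anything unparseable is coerced to NaN
year_cols = [c for c in toy.columns if c.endswith('年')]
toy[year_cols] = toy[year_cols].apply(pd.to_numeric, errors='coerce')

print(toy.dtypes)
```

A column that ends up containing NaN is stored as float, which is presumably why the counts above are displayed as `2454.0` rather than `2454`.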

Plotting deaths among ages 0-4 by year

from matplotlib.pyplot import *
from matplotlib.font_manager import FontProperties
from matplotlib import rcParams
import numpy as np

style.use('ggplot')

# Use a Japanese-capable font (IPAex Gothic) so that the labels render correctly
fp = FontProperties(fname='/usr/share/fonts/opentype/ipaexfont-gothic/ipaexg.ttf', size=54)
rcParams['font.family'] = fp.get_name()
rcParams['font.size'] = 20

fig, ax = subplots(figsize=(25, 12))
# Row 1 is the 0〜4歳 group; skip the label column and sort the years by deaths, largest first
df.loc[1].iloc[1:].astype(float).sort_values(ascending=False).plot(kind='barh', ax=ax)
# x ticks every 10,000 deaths, labeled in units of 万 (10,000)
ticks = np.arange(0, 2.5e5, 1e4)
xticks(ticks, ['{}万'.format(int(x / 1e4)) if x > 0 else 0 for x in ticks])
# Annotate each bar with its value
for bar in ax.patches:
    ax.text(bar.get_width() + 1e3, bar.get_y() + .48, int(round(bar.get_width())),
            fontsize=22, fontname='Arial', color='dimgrey', fontweight='bold')
ax.invert_yaxis();

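Far too many children died in 1950. In raw counts, deaths among ages 0-4 in 1950 (222,903) were about 91 times the 2017 figure (2,454); once the relative size of the 0-4 population is taken into account, my rough guess is that the gap is about 40 times. Children in Japan today are fortunate: compared with children in the past, far fewer of them die of disease.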
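
To make the comparison concrete, here is a rough sketch using the two 0〜4歳 counts from the table above. The helper `rate_ratio` is a name I made up for illustration, and no real population figures are used: the code only works out what population ratio the "about 40 times" guess would imply.

```python
# Deaths among ages 0-4, copied from the table above
deaths_1950 = 222903
deaths_2017 = 2454

# Ratio of the raw counts, with no population adjustment
raw_ratio = deaths_1950 / deaths_2017
print(round(raw_ratio, 1))  # 90.8


def rate_ratio(deaths_a, pop_a, deaths_b, pop_b):
    """Ratio of deaths per person between two years (hypothetical helper)."""
    return (deaths_a / pop_a) / (deaths_b / pop_b)


# Population ratio (1950 relative to 2017) implied by the rough "about 40 times" guess
implied_pop_ratio = raw_ratio / 40
print(round(implied_pop_ratio, 2))  # 2.27

# Sanity check: with populations expressed relative to 2017 = 1.0, the ratio comes back as 40
print(round(rate_ratio(deaths_1950, implied_pop_ratio, deaths_2017, 1.0), 1))  # 40.0
```

In other words, the 40x figure holds only if the 0-4 population in 1950 was a bit more than twice its 2017 size. With actual population counts for the two years, `rate_ratio` would give the adjusted figure directly.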

Plotting deaths among ages 100 and over by year

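fig, ax = subplots(figsize=(25, 12))
# Row 21 is the 100-and-over group; skip the label column and sort the years by deaths, smallest first
df.loc[21].iloc[1:].astype(float).sort_values(ascending=True).plot(kind='barh', ax=ax)
# x ticks every 2,000 deaths, labeled in units of 万 (10,000), e.g. 0.2万
ticks = np.arange(0, 3.1e4, 2e3)
xticks(ticks, ['{}万'.format(float(x / 1e4)) if x > 0 else 0 for x in ticks])
# Annotate each bar with its value
for bar in ax.patches:
    ax.text(bar.get_width() + 1e2, bar.get_y() + .48, int(round(bar.get_width())),
            fontsize=22, fontname='Arial', color='dimgrey', fontweight='bold')
ax.invert_yaxis();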

Over the past 60 years or so, the number of deaths at age 100 and over has grown at a tremendous pace.

Plotting 2017 deaths by age group

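fig, ax = subplots(figsize=(25, 12))
# Rows 1 to 22: every row after the overall total in row 0
df[['2017年', '年齢(5歳階級)']].iloc[1:23].set_index('年齢(5歳階級)').plot(ax=ax, kind='barh')
# x ticks every 10,000 deaths, labeled in units of 万 (10,000)
ticks = np.arange(0, 2.9e5, 1e4)
xticks(ticks, ['{}万'.format(int(x / 1e4)) if x > 0 else 0 for x in ticks])
# The legend label 死亡者数 means "number of deaths"
ax.legend(["死亡者数"], loc='upper right', prop={'size': 26});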

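The 85-89 group has the most deaths. Hardly anyone dies between the ages of 5 and 14. The 0-4 group alone has more deaths (2,454) than the 5-9, 10-14 and 15-19 groups combined (1,949). The chart also shows that there are almost no deaths of unknown age.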
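
The comparison between the 0-4 group and ages 5-19 can be checked by hand. This is just a quick sketch using the 2017 values copied from the `df.head()` output near the top of this post; the variable names are mine.

```python
# 2017 deaths for the four youngest age groups, copied from the df.head() output
deaths_2017 = {'0〜4歳': 2454, '5〜9歳': 351, '10〜14歳': 437, '15〜19歳': 1161}

# Total for ages 5-19 (everything except the 0-4 group)
ages_5_to_19 = sum(v for k, v in deaths_2017.items() if k != '0〜4歳')

print(ages_5_to_19)                          # 1949
print(deaths_2017['0〜4歳'] > ages_5_to_19)  # True
```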