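8. Line Graphs

8.1 Life Expectancy in Japan and China

8.1 Exercise

8.2 Age of Winning Candidates: LDP and DPJ

8.3 Exercise

8.4 Political Polarization in the U.S. House (Economic Dimension)

8.4 Exercise

9. Choropleth Maps

9.1. World Map

9.2. Map of Japan

9.3 Prefecture (Municipality) Maps

10. Dot Plots

10.1. Basic Dot Plots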

Packages used in this section

library("tidyverse")

## ─ Attaching packages ──────────────────── tidyverse 1.3.0 ─

## ✓ ggplot2 3.3.1     ✓ purrr   0.3.4
## ✓ tibble  3.0.1     ✓ dplyr   1.0.0
## ✓ tidyr   1.1.0     ✓ stringr 1.4.0
## ✓ readr   1.3.1     ✓ forcats 0.5.0

## ─ Conflicts ───────────────────── tidyverse_conflicts() ─
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()

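What the message means:
・Attaching tidyverse masks two functions from the stats package: filter() and lag()
→ This section uses filter() later on
→ To be explicit about which version is called, write the function with its namespace:
filter() → dplyr::filter()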
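
The masking described above can be checked directly. The following is a minimal sketch on made-up data (the object toy and its values are illustrative assumptions, not part of the course data). It shows that a namespace prefix selects a specific version of filter() regardless of which packages are attached.

```r
library(dplyr)

# Hypothetical toy data, for illustration only
toy <- tibble(x = 1:5)

# Explicitly call the dplyr version of filter(): keeps rows where x > 3
kept <- dplyr::filter(toy, x > 3)
kept

# stats::filter() is a different function (linear filtering of a series);
# the prefix selects it even while dplyr is attached
smoothed <- stats::filter(1:5, rep(1 / 3, 3))
smoothed
```

Writing the prefix every time is optional, but it makes the code robust to the order in which packages are loaded.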

library("gapminder")
library("DT")

・Setting that keeps Japanese text in ggplot from being garbled

theme_set(theme_classic(base_size = 10,
                        base_family = "HiraginoSans-W3"))
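
The font "HiraginoSans-W3" ships with macOS, so the setting above may not work unchanged on other systems. The sketch below picks a font family by operating system; the Windows and fallback font names are assumptions and should be replaced with fonts that are actually installed.

```r
library(ggplot2)

# Pick a Japanese-capable font family depending on the operating system.
# "Yu Gothic" and "IPAexGothic" are assumed names, not taken from this document.
jp_font <- switch(Sys.info()[["sysname"]],
                  Darwin  = "HiraginoSans-W3",  # macOS (the font used in this document)
                  Windows = "Yu Gothic",        # assumed Windows font
                  "IPAexGothic")                # assumed fallback (e.g., Linux)

theme_set(theme_classic(base_size = 10, base_family = jp_font))
```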

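8. Line Graphs

8.1 Life Expectancy in Japan and China

Gapminder

・gapminder is a dataset provided by the gapminder package (loaded above)
・It contains the following variables:
(1) country: country name
(2) continent: continent name
(3) lifeExp: life expectancy
(4) pop: population
(5) gdpPercap: GDP per capita
(6) year: 1952-2007 (every 5 years)

・Take a quick look (glimpse) at the data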

data(gapminder)
glimpse(gapminder)

Rows: 1,704
Columns: 6
$ country   <fct> Afghanistan, Afghanistan, Afghanistan, Afghanistan, Afghan…
$ continent <fct> Asia, Asia, Asia, Asia, Asia, Asia, Asia, Asia, Asia, Asia…
$ year      <int> 1952, 1957, 1962, 1967, 1972, 1977, 1982, 1987, 1992, 1997…
$ lifeExp   <dbl> 28.801, 30.332, 31.997, 34.020, 36.088, 38.438, 39.854, 40…
$ pop       <int> 8425333, 9240934, 10267083, 11537966, 13079460, 14880372, …
$ gdpPercap <dbl> 779.4453, 820.8530, 853.1007, 836.1971, 739.9811, 786.1134…

・Summary of gapminder

summary(gapminder)

        country        continent        year         lifeExp     
 Afghanistan:  12   Africa  :624   Min.   :1952   Min.   :23.60  
 Albania    :  12   Americas:300   1st Qu.:1966   1st Qu.:48.20  
 Algeria    :  12   Asia    :396   Median :1980   Median :60.71  
 Angola     :  12   Europe  :360   Mean   :1980   Mean   :59.47  
 Argentina  :  12   Oceania : 24   3rd Qu.:1993   3rd Qu.:70.85  
 Australia  :  12                  Max.   :2007   Max.   :82.60  
 (Other)    :1632                                                
      pop              gdpPercap       
 Min.   :6.001e+04   Min.   :   241.2  
 1st Qu.:2.794e+06   1st Qu.:  1202.1  
 Median :7.024e+06   Median :  3531.8  
 Mean   :2.960e+07   Mean   :  7215.3  
 3rd Qu.:1.959e+07   3rd Qu.:  9325.5  
 Max.   :1.319e+09   Max.   :113523.1

・List the countries

unique(gapminder$country)

  [1] Afghanistan              Albania                  Algeria                 
  [4] Angola                   Argentina                Australia               
  [7] Austria                  Bahrain                  Bangladesh              
 [10] Belgium                  Benin                    Bolivia                 
 [13] Bosnia and Herzegovina   Botswana                 Brazil                  
 [16] Bulgaria                 Burkina Faso             Burundi                 
 [19] Cambodia                 Cameroon                 Canada                  
 [22] Central African Republic Chad                     Chile                   
 [25] China                    Colombia                 Comoros                 
 [28] Congo, Dem. Rep.         Congo, Rep.              Costa Rica              
 [31] Cote d'Ivoire            Croatia                  Cuba                    
 [34] Czech Republic           Denmark                  Djibouti                
 [37] Dominican Republic       Ecuador                  Egypt                   
 [40] El Salvador              Equatorial Guinea        Eritrea                 
 [43] Ethiopia                 Finland                  France                  
 [46] Gabon                    Gambia                   Germany                 
 [49] Ghana                    Greece                   Guatemala               
 [52] Guinea                   Guinea-Bissau            Haiti                   
 [55] Honduras                 Hong Kong, China         Hungary                 
 [58] Iceland                  India                    Indonesia               
 [61] Iran                     Iraq                     Ireland                 
 [64] Israel                   Italy                    Jamaica                 
 [67] Japan                    Jordan                   Kenya                   
 [70] Korea, Dem. Rep.         Korea, Rep.              Kuwait                  
 [73] Lebanon                  Lesotho                  Liberia                 
 [76] Libya                    Madagascar               Malawi                  
 [79] Malaysia                 Mali                     Mauritania              
 [82] Mauritius                Mexico                   Mongolia                
 [85] Montenegro               Morocco                  Mozambique              
 [88] Myanmar                  Namibia                  Nepal                   
 [91] Netherlands              New Zealand              Nicaragua               
 [94] Niger                    Nigeria                  Norway                  
 [97] Oman                     Pakistan                 Panama                  
[100] Paraguay                 Peru                     Philippines             
[103] Poland                   Portugal                 Puerto Rico             
[106] Reunion                  Romania                  Rwanda                  
[109] Sao Tome and Principe    Saudi Arabia             Senegal                 
[112] Serbia                   Sierra Leone             Singapore               
[115] Slovak Republic          Slovenia                 Somalia                 
[118] South Africa             Spain                    Sri Lanka               
[121] Sudan                    Swaziland                Sweden                  
[124] Switzerland              Syria                    Taiwan                  
[127] Tanzania                 Thailand                 Togo                    
[130] Trinidad and Tobago      Tunisia                  Turkey                  
[133] Uganda                   United Kingdom           United States           
[136] Uruguay                  Venezuela                Vietnam                 
[139] West Bank and Gaza       Yemen, Rep.              Zambia                  
[142] Zimbabwe                
142 Levels: Afghanistan Albania Algeria Angola Argentina Australia ... Zimbabwe

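Trends in life expectancy (Japan and China)

・Variables needed
(1) year: 1952-2007 (every 5 years)
(2) lifeExp: life expectancy
(3) country: country name

Life expectancy time series (Japan)

・Extract only the rows for Japan and name the result Japan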

Japan <- gapminder %>%
  dplyr::filter(country == "Japan") %>%
  dplyr::select(year, lifeExp) # keep only year and lifeExp

Japan # print the extracted data to check it

# A tibble: 12 x 2
    year lifeExp
   <int>   <dbl>
 1  1952    63.0
 2  1957    65.5
 3  1962    68.7
 4  1967    71.4
 5  1972    73.4
 6  1977    75.4
 7  1982    77.1
 8  1987    78.7
 9  1992    79.4
10  1997    80.7
11  2002    82  
12  2007    82.6

・Plot how life expectancy in Japan changes over time

ggplot(Japan, aes(x = year, y = lifeExp)) +
  geom_point() +
  geom_line()

Life expectancy time series (Japan and China)

・Extract the life expectancy data for Japan and China and name the result jpn.chi

jpn.chi <- gapminder %>%
  dplyr::filter(country == "China" | country == "Japan") %>%
  dplyr::select(year, country, lifeExp)  # keep only year, country, and lifeExp

jpn.chi

# A tibble: 24 x 3
    year country lifeExp
   <int> <fct>     <dbl>
 1  1952 China      44  
 2  1957 China      50.5
 3  1962 China      44.5
 4  1967 China      58.4
 5  1972 China      63.1
 6  1977 China      64.0
 7  1982 China      65.5
 8  1987 China      67.3
 9  1992 China      68.7
10  1997 China      70.4
# … with 14 more rows
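
When more than two countries are needed, chaining conditions with | gets long. A common alternative is %in%. The sketch below uses made-up rows (toy and targets are hypothetical names, and the values are illustrative) to show that both forms select the same rows.

```r
library(dplyr)

# Hypothetical toy data standing in for gapminder
toy <- tibble(
  country = c("China", "Japan", "France", "Japan"),
  lifeExp = c(44.0, 63.0, 67.4, 65.5)
)

targets <- c("China", "Japan")

# Two equivalent ways to keep the rows for China and Japan
by_or <- dplyr::filter(toy, country == "China" | country == "Japan")
by_in <- dplyr::filter(toy, country %in% targets)

by_in
```

With %in%, adding a country only requires extending the targets vector.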

・Use datatable() from DT to display the data in an easier-to-read form

DT::datatable(jpn.chi)

	year	country	lifeExp
1	1952	China	44
2	1957	China	50.54896
3	1962	China	44.50136
4	1967	China	58.38112
5	1972	China	63.11888
6	1977	China	63.96736
7	1982	China	65.525
8	1987	China	67.274
9	1992	China	68.69
10	1997	China	70.426

・Plot how life expectancy in Japan and China changes over time

ggplot(jpn.chi, aes(x = year, y = lifeExp, color = country)) +
  geom_point() +
  geom_line() +
  ggtitle("Life Expectancy: Japan and China 1952-2007")

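8.1 Exercise

・Using the gapminder data (from the gapminder package), draw a time-series line graph of life expectancy from 1952 to 2007 for three countries: Japan and any two others.

・gapminder is a dataset provided by the gapminder package
・It contains the following variables:
(1) country: country name
(2) continent: continent name
(3) year: 1952-2007 (every 5 years)
(4) lifeExp: life expectancy
(5) pop: population
(6) gdpPercap: GDP per capita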

8.2 Age of Winning Candidates: LDP and DPJ

・Read in the House of Representatives election dataset covering 1996 to 2014 ( hr96_14.csv ) and name it hr

hr <- read_csv("hr96_14.csv", na = ".")

・Check the variables contained in the data frame hr.

names(hr)

 [1] "year"       "ku"         "kun"        "party"      "party_code"
 [6] "name"       "age"        "status"     "nocand"     "rank"      
[11] "wl"         "previous"   "votes"      "voteshare"  "eligible"  
[16] "turnout"    "exp"        "exppv"      "ldp"

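・Using the following four variables, draw time-series graphs, one for the LDP and one for the DPJ, of the median age of single-member district winners in House of Representatives elections from 1996 to 2014

year : year the election was held (1996-2014)
age : candidate's age
wl : 0 = lost the single-member district, 1 = won the single-member district, 2 = lost the district but won a proportional representation seat
party : candidate's party, LDP = Liberal Democratic Party, DPJ = Democratic Party of Japan

・Drawing this line graph requires a dataset with the following variable

・age.median: median age of winners in each election (1996-2014)
・age.median: not contained in the data frame => it has to be created
・age.median: computed for each election and each party

Computing age.median with the dplyr package

・Before computing, check the distribution of age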

summary(hr$age)

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
  25.00   42.00   51.00   50.67   59.00   94.00       5

・Confirm that age has 5 missing values (NA's)

party.median <- hr %>%                                # save as party.median
  dplyr::filter(party == "LDP" | party == "DPJ") %>%  # keep only the LDP and the DPJ
  dplyr::filter(wl == 1) %>%                          # keep only winners (wl == 1)
  drop_na(age) %>%                                    # drop rows where age is missing
  group_by(year, party) %>%                           # compute by year and by party
  summarise(age.median = median(age))                 # save the median of age as age.median
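
The grouped-median step can be tried on a small example before running it on the full data. The sketch below uses made-up rows with the same column names as hr (the values are illustrative assumptions). It also shows two optional variations: na.rm = TRUE as an alternative to dropping rows beforehand, and .groups = "drop" (available in dplyr 1.0.0 and later) to return an ungrouped result.

```r
library(dplyr)

# Hypothetical toy data with the same column names as hr
toy <- tibble(
  year  = c(1996, 1996, 1996, 1996, 2000, 2000),
  party = c("LDP", "LDP", "DPJ", "DPJ", "LDP", "DPJ"),
  age   = c(50, 60, 40, NA, 58, 45)
)

toy_median <- toy %>%
  group_by(year, party) %>%                        # one group per election and party
  summarise(age.median = median(age, na.rm = TRUE),  # ignore missing ages
            .groups = "drop")                       # return an ungrouped tibble

toy_median
```

The result has one row per combination of year and party, which is the shape ggplot needs for one line per party.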

・Display the data as an interactive table with datatable().

DT::datatable(party.median)

	year	party	age.median
1	1996	DPJ	48
2	1996	LDP	55
3	2000	DPJ	49
4	2000	LDP	57
5	2003	DPJ	46
6	2003	LDP	56
7	2005	DPJ	47.5
8	2005	LDP	54
9	2009	DPJ	47
10	2009	LDP	56

・The data needed to draw the line graph are now ready

Draw the line graph

ggplot(party.median, aes(x = year, y = age.median, 
                         color = party, linetype = party, shape = party)) +
  geom_point() +
  geom_line() +
  lims(y = c(40, 60)) +      # set the range of the y axis
  # legend title "政党" = party; labels "民主党" = DPJ, "自民党" = LDP
  scale_color_discrete(name = "政党",
                       labels = c("民主党", "自民党")) +
  scale_linetype_discrete(name = "政党",
                          labels = c("民主党", "自民党")) + 
  scale_shape_discrete(name = "政党",
                       labels = c("民主党", "自民党")) +
  # a single title; a second ggtitle() call would simply override the first
  ggtitle("当選者年齢の推移 (自民党と民主党：1996年〜2014年)") +
  labs(x = "総選挙年", y = "年齢の中央値")

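8.3 Exercise

・Read in the House of Representatives election dataset covering 1996 to 2014 ( hr96_14.csv ) and name it hr

・Using the following four variables, draw time-series graphs, one for the LDP and one for the DPJ, of the median vote share of single-member district winners in House of Representatives elections from 1996 to 2014.

year: year the election was held
voteshare: candidate's vote share (%)
wl: 0 = lost the single-member district, 1 = won the single-member district, 2 = lost the district but won a proportional representation seat
party: candidate's party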

8.4 Political Polarization in the U.S. House (Economic Dimension)

・Ideal points of all members on bills in the U.S. House, from the 80th (1947-1948) to the 112th (2011-2012) Congress
・DW-NOMINATE score
・dwnom1 (x axis): economic issues, from -1 (liberal) to 1 (conservative)
・dwnom2 (y axis): racial issues, from -1 (liberal) to 1 (conservative)
・Sample size: 14552

Source: Nolan McCarty, Keith T. Poole, and Howard Rosenthal (2006) Polarized America: The Dance of Ideology and Unequal Riches. MIT Press.

Plot the median of the economic dimension (dwnom1) by Congress and by party

・Drawing this line graph requires the following dataset

・Ideal points of all members on bills in the U.S. House
・Read in the DW-NOMINATE score dataset ( congress.csv ) and name it US

・Read in the data

US <- read_csv("congress.csv")

・Before computing, check the distribution of congress

summary(US$congress)

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  80.00   88.00   96.00   96.01  104.00  112.00

・There are no missing values in congress
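
The summary above covers only the congress column, whereas the median is computed on dwnom1. The sketch below shows one way to count missing values in every column at once; it uses made-up rows with the same column names as congress.csv (the values, including the NA, are illustrative assumptions).

```r
library(dplyr)

# Hypothetical toy data with the same column names as congress.csv
toy <- tibble(
  congress = c(80, 80, 81),
  party    = c("Democrat", "Republican", "Democrat"),
  dwnom1   = c(-0.13, 0.27, NA)
)

# Number of missing values in each column
na_counts <- colSums(is.na(toy))
na_counts
```

If a column used in median() contained NA, the grouped median would also be NA unless na.rm = TRUE is set or the rows are dropped first.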

Computing econ.median with `tidyverse`

US <- US %>%                                                      # save the result as US
  dplyr::filter(party == "Republican" | party == "Democrat") %>%  # keep only Republicans and Democrats
  group_by(congress, party) %>%                                   # compute by congress and by party
  summarise(econ.median = median(dwnom1))                         # save the median of dwnom1 as econ.median

・Display the data as an interactive table with datatable().

DT::datatable(US)

	congress	party	econ.median
1	80	Democrat	-0.126499995589256
2	80	Republican	0.265999987721443
3	81	Democrat	-0.207000002264977
4	81	Republican	0.261999994516373
5	82	Democrat	-0.178999997675419
6	82	Republican	0.261000007390976
7	83	Democrat	-0.17399999499321
8	83	Republican	0.257499992847443
9	84	Democrat	-0.222500003874302
10	84	Republican	0.251000002026558

・Draw the line graph

ggplot(US, aes(x = congress, y = econ.median, color = party)) +
  geom_point() +
  geom_line() +
  theme_bw() +                # use a white background
  ggtitle("Political Polarization: ECON Dimension (US Congress: 1947-2012)") +
  labs(x = "Congress",        # label for the x axis
       y = "DW-NOMINATE score (economic dimension)")  # label for the y axis

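・The graph shows how the DW-NOMINATE score (economic dimension) changes over time
・The ideological centers of Democratic and Republican members on economic issues start to diverge around the 95th Congress
・In recent years, Democrats have become more liberal (−) and Republicans more conservative (+)
= political polarization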

8.4 Exercise

Source: Nolan McCarty, Keith T. Poole, and Howard Rosenthal (2006) Polarized America: The Dance of Ideology and Unequal Riches. MIT Press.

・Ideal points of all members on bills in the U.S. House, from the 80th (1947-1948) to the 112th (2011-2012) Congress
・DW-NOMINATE score
・dwnom1: economic issues, from -1 (liberal) to 1 (conservative)
・dwnom2: racial issues, from -1 (liberal) to 1 (conservative)
・Sample size: 14552

Plot the median of the racial dimension (dwnom2) by Congress and by party
・Ideal points of all members on bills in the U.S. House
・Read in the DW-NOMINATE score dataset ( congress.csv ) and name it congress

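9. Choropleth Maps

・A choropleth map is one way of showing statistical values on a map
・Values are shown by color or shading for each country or prefecture (in quantitative political science, for each electoral district)
・Suppose we want a map shaded by population size → two kinds of data are needed

(1) Map data

(2) Population data

・Once both are obtained, merge them → draw the map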

library("tidyverse")
library("jpndistrict")
library("sf")
library("magrittr")
library("ggimage")
library("DT")
library("ggthemes")   # provides theme_map(), used for the maps below

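9.1. World Map

・Here we create a choropleth map of each country's population in 2016
・Create a folder named data inside the project folder beforehand

9.1.1. World Population Data (Download)

Download the world population data from World Population Prospects 2019

・Run the command below in a chunk → the population data are downloaded into data automatically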

download.file(url = "https://population.un.org/wpp/Download/Files/1_Indicators%20(Standard)/CSV_FILES/WPP2019_TotalPopulationBySex.csv",
              destfile = "data/world_pop.csv")

9.1.2. Population by Country (2016)

・Download the data from the Web → obtain world_pop.csv
→ Place the downloaded world_pop.csv in the data folder
・Read the data and name it df_world_pop

df_world_pop <- read_csv("data/world_pop.csv") %>% 
    filter(Time == 2016) %>%       
    rename(country = Location) %>%             # rename the variable Location to country
    filter(!str_detect(country, "China, "), LocID != 1105,
           LocID != 850, LocID != 1111) %>% 
    mutate(
        country = str_remove_all(country, c(" of America")), # remove " of America"
        country = str_replace_all(country, c("Russian Federation" = "Russia")) # shorten the name
        ) %>% 
    select(Time, LocID, country, PopTotal) # keep only these 4 variables
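
The two string operations in the pipeline above can be tried in isolation. The sketch below applies them to a made-up vector of names (countries is a hypothetical object used only for illustration).

```r
library(stringr)

# Hypothetical country names, for illustration only
countries <- c("United States of America", "Russian Federation", "Japan")

# Remove a fixed piece of text wherever it occurs
step1 <- str_remove_all(countries, " of America")

# A named vector maps each pattern (the name) to its replacement (the value)
step2 <- str_replace_all(step1, c("Russian Federation" = "Russia"))

step2
```

Names cleaned this way only help if they end up identical to the names in the map data, so it is worth comparing the two sets of names before merging.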

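These are the population data for each country in 2016. The data frame contains the following variables

Column	Variable	Description
1	`Time`	Year
2	`LocID`	Location ID
3	`country`	Country name
4	`PopTotal`	Population (thousands)

・Interactive display of the data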

DT::datatable(df_world_pop)

	Time	LocID	country	PopTotal
1	2016	4	Afghanistan	35383.028
2	2016	903	Africa	1213040.542
3	2016	1823	African Group	1211379.638
4	2016	1560	African Union	1211918.382
5	2016	2080	African Union: Central Africa	140710.302
6	2016	2081	African Union: Eastern Africa	353241.036
7	2016	2082	African Union: Northern Africa	192623.121
8	2016	2083	African Union: Southern Africa	168186.105
9	2016	2084	African Union: Western Africa	357157.818
10	2016	1200	African, Caribbean and Pacific (ACP) Group of States	1064364.411

・Get the world map data with rnaturalearth::ne_countries().

df_world_sf <- rnaturalearth::ne_countries(returnclass = "sf") %>% 
    rename(country = sovereignt) %>% 
    select(country, geometry) %>% 
    filter(country != "Antarctica") %>% 
    mutate(
        country = str_remove_all(country, c(" of America"))
        )

・Check the contents of the map data

head(df_world_sf)

Simple feature collection with 6 features and 1 field
geometry type:  MULTIPOLYGON
dimension:      XY
bbox:           xmin: -73.41544 ymin: -55.25 xmax: 75.15803 ymax: 42.68825
CRS:            +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0
               country                       geometry
1          Afghanistan MULTIPOLYGON (((61.21082 35...
2               Angola MULTIPOLYGON (((16.32653 -5...
3              Albania MULTIPOLYGON (((20.59025 41...
4 United Arab Emirates MULTIPOLYGON (((51.57952 24...
5            Argentina MULTIPOLYGON (((-65.5 -55.2...
6              Armenia MULTIPOLYGON (((43.58275 41...

9.1.3. Merging the Map Data and the Population Data

・Merge the population data (df_world_pop) and the map data (df_world_sf) into a data frame named df_world

df_world <- df_world_pop %>% 
  merge(df_world_sf, by = "country", all = TRUE) %>% 
  distinct(LocID, country,.keep_all = TRUE) %>% 
  rownames_to_column() %>% 
  st_as_sf()

names(df_world)

[1] "rowname"  "country"  "Time"     "LocID"    "PopTotal" "geometry"
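
A merge on country names silently leaves NA wherever the two sources spell a name differently. One way to find such cases is anti_join(), sketched below on made-up tables (pop, map, and their values are hypothetical and stand in for df_world_pop and df_world_sf).

```r
library(dplyr)

# Hypothetical toy tables, for illustration only
pop <- tibble(country  = c("Japan", "Russia", "Viet Nam"),
              PopTotal = c(127000, 144000, 93000))
map <- tibble(country = c("Japan", "Russia", "Vietnam"))

# Rows of the map table that found no partner in the population table
unmatched <- anti_join(map, pop, by = "country")
unmatched
```

Any name listed in the result would appear on the map in the NA color until the spelling is harmonized in one of the two tables.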

9.1.4. Drawing the World Map

・Everything is now ready to draw the map
・So that the map is rendered at a suitable size in the 'html' output, set the chunk options fig.height=45, fig.width=45

map_pop_world <- df_world %>% 
  ggplot() + 
  geom_sf(aes(fill = PopTotal)) +
  scale_fill_distiller(name = "人口",
                       palette = "Reds", direction = 1, 
                       na.value = "grey", guide = "legend") +
  theme_map(base_family = "HiraginoSans-W3") +
  theme(legend.position = c(0.1, 0.3),
        legend.title = element_text(size = 45), 
        legend.text = element_text(size = 40),
        legend.key.size = unit(5, "cm"),
        legend.key.width = unit(4.5,"cm")) +
  coord_sf(datum = NA) 

map_pop_world

World map by population size (2016)

・Create a folder named fig inside the project
→ Save the map there, specifying fig in the path
→ The file name is arbitrary: here it is map_pop_world

・Save the map to fig

ggsave("fig/map_pop_world.png", map_pop_world, width = 45, height = 45)

9.2. Map of Japan

9.2.1. Map Data for Japan

・Load the jpndistrict package, which provides the data needed to draw the map

library(jpndistrict)

Read the prefecture-level map data included in the jpndistrict package
→ Name it df_sf_ja

df_sf_ja <- 1:47 %>%
  map(~ jpn_pref(pref_code = ., district = FALSE)) %>% 
  reduce(rbind) %>% 
  st_simplify(dTolerance = 0.01)

・Shift Okinawa's geometry so that it appears as an inset near the main islands

df_sf_ja_omit47 <- df_sf_ja %>% 
  filter(pref_code != "47")
df_sf_ja_okinawa <- df_sf_ja %>%
  filter(pref_code == "47")
df_sf_ja_okinawa$geometry %<>%
  add(c(5.6, 17.5))
df_sf_ja_okinawa %<>% st_set_crs(value = 4326)

df_sf_ja<- df_sf_ja %>% 
  filter(pref_code != "47") %>% 
  bind_rows(df_sf_ja_okinawa)

names(df_sf_ja)

[1] "pref_code"  "prefecture" "geometry"
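
The Okinawa step above relies on the arithmetic that sf defines for geometries: adding a numeric vector of length two shifts the coordinates by that amount. The sketch below shows the idea on a single made-up point (the coordinates 127, 26 are an approximate, illustrative location, not taken from the map data), and re-attaches the coordinate reference system afterwards as the code above does.

```r
library(sf)

# A hypothetical point roughly where Okinawa lies (longitude, latitude)
okinawa_pt <- st_sfc(st_point(c(127, 26)), crs = 4326)

# Adding a length-2 numeric vector shifts the coordinates by (dx, dy)
shifted <- okinawa_pt + c(5.6, 17.5)

# Re-attach the coordinate reference system after the arithmetic
shifted <- st_set_crs(shifted, 4326)

st_coordinates(shifted)
```

The shifted geometry no longer reflects a real location, which is why the maps below draw line segments to mark the inset.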

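9.2.2. Population Data for Japan (2015)

・Here we use a processed version (jpn_pop.csv) of the 2015 "population by prefecture" data (Asahi Shimbun, February 27, 2016)

・Download the prefecture population data (jpn_pop.csv), create a data folder inside the RProject folder, and place the file in it
・In RStudio, read the population data (jpn_pop.csv) from the data folder and name it df_jpn_pop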

df_jpn_pop <- read_csv("data/jpn_pop.csv", 
                       locale = locale(encoding = "cp932"))

9.2.3. Merging the Map Data and the Population Data

・Merge df_sf_ja and df_jpn_pop => df_jap
・Use prefecture as the key for the merge

df_jap <- df_jpn_pop %>% 
    left_join(df_sf_ja, by = "prefecture") %>%   
    st_as_sf()

・These are the 2015 population data by prefecture
・The data frame contains the following variables

head(df_jap)

Simple feature collection with 6 features and 5 fields
geometry type:  GEOMETRY
dimension:      XY
bbox:           xmin: 128.5369 ymin: 24.2262 xmax: 153.9869 ymax: 45.38578
CRS:            EPSG:4326
# A tibble: 6 x 6
  pref   prefecture   total  diff pref_code                             geometry
  <chr>  <chr>        <dbl> <dbl> <chr>                           <GEOMETRY [°]>
1 okina… 沖縄県      1.43e6   3   47        MULTIPOLYGON (((136.8849 43.4615, 1…
2 tokyo  東京都      1.35e7   2.7 13        MULTIPOLYGON (((153.972 24.28716, 1…
3 aichi  愛知県      7.48e6   1   23        MULTIPOLYGON (((136.6788 35.22858, …
4 kanag… 神奈川県    9.13e6   0.9 14        MULTIPOLYGON (((139.0036 35.29035, …
5 saita… 埼玉県      7.26e6   0.9 11        POLYGON ((138.8275 36.03807, 138.86…
6 fukuo… 福岡県      5.10e6   0.6 40        MULTIPOLYGON (((130.0623 33.49888, …

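Column	Variable	Description
1	`pref`	Prefecture name (English)
2	`prefecture`	Prefecture name (Japanese)
3	`total`	Population (persons)
4	`diff`	Rate of change, 2010-2015 (%)
5	`pref_code`	Prefecture code
6	`geometry`	Map geometry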

9.2.4. 日本地図を描く

map_pop_jpn_1 <- df_jap %>% 
    ggplot() +
    geom_sf(aes(fill = total)) +                                # fill by total
    geom_segment(aes(x = round(st_bbox(df_sf_ja_omit47)[1], 0),
                     xend = 132.5, y = 40, yend = 40)) + 
    geom_segment(aes(x = 132.5, xend = 138, 
                     y = 40, yend = 42)) +
    geom_segment(aes(x = 138, xend = 138,
                     y = 42, yend = round(st_bbox(df_sf_ja_omit47)[4],  0))) +
    scale_fill_distiller(name = "人口",
                         palette = "Greens", direction = 1) +   # choose any palette
    theme_map(base_family = "HiraginoSans-W3") +
    theme(legend.position = c(0.8, 0.5),
          legend.title = element_text(size = 55), 
          legend.text = element_text(size = 50),
          legend.key.size = unit(5, "cm"),
          legend.key.width = unit(4.5,"cm")) +
    coord_sf(datum = NA) 

map_pop_jpn_1

Map of Japan by prefecture: population size (2015)

・Create a folder named fig inside the project → save the map there, specifying fig in the path → the file name is arbitrary: here it is map_pop_jpn_1
・Save the map to fig

ggsave("fig/map_pop_jpn_1.png", map_pop_jpn_1, width = 45, height = 45)

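9.2.5. Map of Population Change by Prefecture

Show the increase or decrease in population for each prefecture by color.
Increases are shown in blue and decreases in red.
The deeper the blue, the higher the rate of increase; the deeper the red, the higher the rate of decrease.

・Enter {r, fig.align = 'center', fig.height = 45, fig.width = 45} as the chunk options.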

map_pop_jpn_2 <- df_jap %>% 
    ggplot() +
    geom_sf(aes(fill = diff)) +                                  # fill by diff
    geom_segment(aes(x = round(st_bbox(df_sf_ja_omit47)[1], 0),
                     xend = 132.5, y = 40, yend = 40)) + 
    geom_segment(aes(x = 132.5, xend = 138, 
                     y = 40, yend = 42)) +
    geom_segment(aes(x = 138, xend = 138,
                     y = 42, yend = round(st_bbox(df_sf_ja_omit47)[4],  0))) +
    scale_fill_distiller(name = "人口増減(%)",
                         palette = "RdBu", direction = 1) +      # red-to-blue palette
    theme_map(base_family = "HiraginoSans-W3") +
    theme(legend.position = c(0.8, 0.5),
          legend.title = element_text(size = 55), 
          legend.text = element_text(size = 50),
          legend.key.size = unit(5, "cm"),
          legend.key.width = unit(4.5,"cm")) +
    coord_sf(datum = NA) 

map_pop_jpn_2

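Map of Japan by prefecture: population change (2015)

・Create a folder named fig inside the project
→ Save the map there, specifying fig in the path
→ The file name is arbitrary: here it is map_pop_jpn_2
・Save the map to fig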

ggsave("fig/map_pop_jpn_2.png", map_pop_jpn_2, width = 45, height = 45)

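9.3 Prefecture (Municipality) Maps

9.3.1. Getting the Map Data for Tokyo's 23 Wards

・Create a choropleth map of the fiscal 2015 population of Tokyo (23 wards)
・Get the map data for Tokyo
・Since the map covers the 23 wards, get only the 23 wards

・Load the jpndistrict package, which provides the data needed to draw the map

library(jpndistrict)

Read the prefecture-level map data included in the jpndistrict package
→ Name it df_sf_ja
→ Inspect df_sf_ja to find the pref_code for Tokyo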

head(df_sf_ja, 23)

Simple feature collection with 23 features and 2 fields
geometry type:  GEOMETRY
dimension:      XY
bbox:           xmin: 135.4483 ymin: 24.2262 xmax: 153.9869 ymax: 45.55792
CRS:            EPSG:4326
# A tibble: 23 x 3
   pref_code prefecture                                                 geometry
   <chr>     <chr>                                                <GEOMETRY [°]>
 1 1         北海道     MULTIPOLYGON (((146.933 44.60533, 146.9484 44.63243, 14…
 2 2         青森県     POLYGON ((139.8576 40.60298, 139.9634 40.68366, 139.996…
 3 3         岩手県     POLYGON ((140.7845 39.02073, 140.808 39.057, 140.8007 3…
 4 4         宮城県     MULTIPOLYGON (((140.4457 38.14469, 140.4789 38.18222, 1…
 5 5         秋田県     POLYGON ((139.6912 39.99377, 139.7012 40.00877, 139.723…
 6 6         山形県     MULTIPOLYGON (((139.5375 38.55854, 139.6085 38.67014, 1…
 7 7         福島県     MULTIPOLYGON (((139.2372 37.187, 139.2196 37.20815, 139…
 8 8         茨城県     POLYGON ((139.7092 36.13537, 139.6902 36.20662, 139.728…
 9 9         栃木県     POLYGON ((139.4055 36.44614, 139.4355 36.46661, 139.416…
10 10        群馬県     POLYGON ((138.524 36.65124, 138.5329 36.6651, 138.5138 …
# … with 13 more rows

・The pref_code for Tokyo is 13 → get the Tokyo data and keep only the 23 wards

df_tokyo_sf <- jpn_pref(13, district = TRUE) %>% 
  filter(str_detect(city, "区"))

head(df_tokyo_sf, 23)

Simple feature collection with 23 features and 4 fields
geometry type:  GEOMETRY
dimension:      XY
bbox:           xmin: 139.5635 ymin: 35.52088 xmax: 139.919 ymax: 35.81775
CRS:            EPSG:4326
# A tibble: 23 x 5
   pref_code prefecture city_code city                                  geometry
   <chr>     <chr>      <chr>     <chr>                           <GEOMETRY [°]>
 1 13        東京都     13101     千代田区… POLYGON ((139.773 35.70374, 139.7733 3…
 2 13        東京都     13102     中央区 POLYGON ((139.7834 35.69654, 139.789 3…
 3 13        東京都     13103     港区   MULTIPOLYGON (((139.7765 35.6362, 139.…
 4 13        東京都     13104     新宿区 POLYGON ((139.6841 35.6916, 139.6883 3…
 5 13        東京都     13105     文京区 POLYGON ((139.745 35.73593, 139.7492 3…
 6 13        東京都     13106     台東区 POLYGON ((139.7666 35.7132, 139.7644 3…
 7 13        東京都     13107     墨田区 POLYGON ((139.7951 35.70471, 139.8072 …
 8 13        東京都     13108     江東区 MULTIPOLYGON (((139.8337 35.70299, 139…
 9 13        東京都     13109     品川区 MULTIPOLYGON (((139.7724 35.62193, 139…
10 13        東京都     13110     目黒区 POLYGON ((139.6806 35.60435, 139.6809 …
# … with 13 more rows

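9.3.2. Getting the Population Data for Tokyo's 23 Wards

・Go to the Portal Site of Official Statistics of Japan (e-Stat)
・Click 「地域」 (Regions)

・Click 「市町村データ」 (Municipality data) → 「データ表示」 (Display data)
・Under 【表示データ】 (Data to display) in the filter, choose 「現在の市区町村」 (Current municipalities)
・Under 【地域区分】 (Region), choose 「東京都」 (Tokyo)
・Under 【絞り込み】 (Filter), choose 「特別区」 (Special wards)
→ Click 「実行」 (Run)

・Click 「全て選択」 (Select all) → 「確定」 (Confirm)

・From 【項目候補】 (Candidate items), choose 「A1101 総人口（人）」 (Total population, persons) → click 「項目を選択」 (Select item)
・Click 「確定」 (Confirm)

・Click 「ダウンロード」 (Download) at the top right of the screen

Under 【ダウンロードの範囲】 (Download range), choose 「ページ上部の選択項目（調査年）」 (Item selected at the top of the page: survey year)
Under 【ファイル形式】 (File format), choose 「csv形式」 (CSV)
Under 【ヘッダの出力】 (Header output), choose 「出力しない」 (Do not output)
Under 【コードの出力】 (Code output), choose 「出力しない」 (Do not output)
✅ 「データがない行を表示しない」 (Do not display rows without data)
✅ 「データがない列を表示しない」 (Do not display columns without data)
→ Click 「ダウンロード」 (Download)
→ Open the downloaded csv file

・Rename the variables:
「調査年」→ year
「地域」→ city
「A1101_総人口【人】」→ population
・Delete the 「/項目」 row
・Remove "東京都 " (東京都 followed by a half-width space) from the city column, as follows:
→ Click the column header B to select the city column, then choose 「編集」 (Edit) → 「検索」 (Find) → 「置換」 (Replace)
→ In 「検索する文字列」 (Find what), enter 東京都 followed by a half-width space
→ Click 「すべて置換」 (Replace all)
・Confirm the notice 「23件を置換しました」 (23 replacements made) → click OK → save

・Once these changes are done, save the file as tokyo_pop.csv in the data folder inside the project folder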
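
The prefix removal described in the steps above can also be done in R instead of a spreadsheet, which keeps the cleaning reproducible. The sketch below is one possible approach, not the procedure used in this document; raw is a hypothetical tibble that mimics the first two rows of the downloaded file before cleaning.

```r
library(dplyr)
library(stringr)

# Hypothetical rows mimicking the downloaded file before cleaning
raw <- tibble(
  year       = c("2015年度", "2015年度"),
  city       = c("東京都 千代田区", "東京都 中央区"),
  population = c(58406, 141183)
)

# Strip the leading "東京都 " (prefecture name plus a half-width space)
cleaned <- raw %>%
  mutate(city = str_remove(city, "^東京都 "))

cleaned$city
```

After this step the city values have the same form as the city column in the map data, which is what the join by city requires.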

9.3.3. Merging the Tokyo Data

・Read in the Tokyo population data (tokyo_pop.csv) obtained through the steps above

df_tokyo_pop <- read_csv("data/tokyo_pop.csv")

・Display the Tokyo map data (df_tokyo_sf)

head(df_tokyo_sf)

Simple feature collection with 6 features and 4 fields
geometry type:  GEOMETRY
dimension:      XY
bbox:           xmin: 139.6733 ymin: 35.62304 xmax: 139.8098 ymax: 35.73593
CRS:            EPSG:4326
# A tibble: 6 x 5
  pref_code prefecture city_code city                                   geometry
  <chr>     <chr>      <chr>     <chr>                            <GEOMETRY [°]>
1 13        東京都     13101     千代田区… POLYGON ((139.773 35.70374, 139.7733 35…
2 13        東京都     13102     中央区 POLYGON ((139.7834 35.69654, 139.789 35…
3 13        東京都     13103     港区   MULTIPOLYGON (((139.7765 35.6362, 139.7…
4 13        東京都     13104     新宿区 POLYGON ((139.6841 35.6916, 139.6883 35…
5 13        東京都     13105     文京区 POLYGON ((139.745 35.73593, 139.7492 35…
6 13        東京都     13106     台東区 POLYGON ((139.7666 35.7132, 139.7644 35…

・Display the population data for Tokyo's 23 wards (df_tokyo_pop)

head(df_tokyo_pop)

# A tibble: 6 x 3
  year     city            population
  <chr>    <chr>                <dbl>
1 2015年度 東京都 千代田区      58406
2 2015年度 東京都 中央区       141183
3 2015年度 東京都 港区         243283
4 2015年度 東京都 新宿区       333560
5 2015年度 東京都 文京区       219724
6 2015年度 東京都 台東区       198073

・Merge the Tokyo map data (df_tokyo_sf) and the Tokyo population data (df_tokyo_pop), using the shared variable (city) as the key

df_tokyo_sf <- df_tokyo_sf %>% 
    left_join(df_tokyo_pop, by = "city") %>% 
    st_as_sf()

df_tokyo_sf %>% 
    head() %>% 
    rmarkdown::paged_table()

pref_code <chr>	prefecture <chr>	city_code <chr>	city <chr>
13	東京都	13101	千代田区
13	東京都	13102	中央区
13	東京都	13103	港区
13	東京都	13104	新宿区
13	東京都	13105	文京区
13	東京都	13106	台東区

9.3.4. 東京23区の人口別マップ

map_pop_tokyo <- df_tokyo_sf %>% 
    ggplot() +
    geom_sf(aes(fill = population)) +
    scale_fill_distiller(name = "人口",
                         palette = "Greens", direction = 1) +
    theme_map(base_family = "HiraginoSans-W3") +
    theme(legend.position = c(.1, -.1),
          legend.direction = "horizontal",
          legend.title = element_text(size = 15), 
          legend.text = element_text(size = 15),
          legend.key.size = unit(1, "cm"),
          legend.key.width = unit(3,"cm")) +
    coord_sf(datum = NA) 

map_pop_tokyo

・Save the map to fig

ggsave("fig/map_pop_tokyo.png", map_pop_tokyo, width = 10, height = 10)

・Add the names of the 23 wards

map_pop_tokyo_text <- df_tokyo_sf %>% 
   mutate(
        text_x = map_dbl(geometry, ~st_centroid(.x)[[1]]),
        text_y = map_dbl(geometry, ~st_centroid(.x)[[2]])
        ) %>% 
  ggplot() +
  geom_sf(aes(fill = population)) +
   geom_label(aes(x = text_x, y = text_y, label = city), 
               size = 1.7, family = "HiraginoSans-W3") +
  scale_fill_distiller(name = "人口",
                       palette = "Greens", direction = 1) +
  theme_map(base_family = "HiraginoSans-W3") +
  theme(legend.position = c(.8, .05),
        legend.title = element_text(size = 10), 
        legend.text = element_text(size = 5),
        legend.key.size = unit(0.5, "cm"),
        legend.key.width = unit(1,"cm")) +
  coord_sf(datum = NA) 

map_pop_tokyo_text

Map of Tokyo's 23 wards by population (2015)
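
The label positions for the ward names come from the centroid of each polygon. The sketch below shows the same idea on a made-up unit square (square and wards are hypothetical objects, and the shape has no coordinate reference system); it extracts all centroid coordinates in one call with st_coordinates() rather than one geometry at a time.

```r
library(sf)

# A hypothetical unit square standing in for a ward polygon
square <- st_polygon(list(rbind(c(0, 0), c(1, 0), c(1, 1), c(0, 1), c(0, 0))))
wards  <- st_sf(city = "toy ward", geometry = st_sfc(square))

# Centroid of each feature, returned as a matrix with columns X and Y
centers <- st_coordinates(st_centroid(st_geometry(wards)))
centers
```

The X and Y columns can then be attached to the data frame and passed to geom_label() as the x and y aesthetics.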

・Save the map to fig

ggsave("fig/map_pop_tokyo_text.png", map_pop_tokyo_text, width = 13, height = 13)

9.3.5. Drawing a Map of House of Representatives Electoral Districts

library(tidyverse)
library(jpndistrict)
library(sf)
library(ggthemes)
library(rmarkdown)

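・Read the shapefile with sf::st_read.
・We use the electoral district map data created by Professor Akira Nishizawa of the Center for Spatial Information Science, the University of Tokyo
・The map data are available on Professor Nishizawa's web page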

df_dist_map <- sf::st_read("shp/senkyoku289polygon.shp",
                       options = "ENCODING=CP932", 
                       stringsAsFactors = FALSE)

options:        ENCODING=CP932 
Reading layer `senkyoku289polygon' from data source `/Users/asanomasahiko/Dropbox/statistics/class_materials/R/shp/senkyoku289polygon.shp' using driver `ESRI Shapefile'
Simple feature collection with 10443 features and 4 fields
geometry type:  POLYGON
dimension:      XY
bbox:           xmin: 122.9337 ymin: 20.42275 xmax: 153.9868 ymax: 45.52647
CRS:            4612

head(df_dist_map) %>% 
  paged_table()

	kucode <dbl>	kuname <chr>	ken <dbl>	ku <dbl>
1	108	北海道8区	1	8
2	108	北海道8区	1	8
3	108	北海道8区	1	8
4	108	北海道8区	1	8
5	108	北海道8区	1	8
6	108	北海道8区	1	8

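Download the Japanese House of Representatives election data from the GitHub repository for Professor Yanai's 『Rによる計量政治学』 (Quantitative Political Science with R)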

df_hr <- read_csv("https://raw.githubusercontent.com/yukiyanai/quant-methods-R/master/data_fixed/hr96-17.csv")

head(df_hr) %>% 
  paged_table()

・In the data just read in (df_hr), district names are written in romanized form
→ Use jpndistrict::jpnprefs to get the prefecture names in Japanese

df_pref_en <- jpndistrict::jpnprefs %>% 
    mutate(
        prefecture_en = str_remove_all(prefecture_en, c("-ken" = "",  # remove -ken
                                                        "-to" = "",   # remove -to
                                                        "-fu" = "")), # remove -fu
        prefecture_en = str_to_lower(prefecture_en),
        # remove only a trailing 都, 府, or 県, so that 京都府 becomes 京都 (not 京)
        prefecture = str_remove(prefecture, "[都府県]$")
    ) %>% 
    select(prefecture, ku = prefecture_en) # keep only these two variables

df_pref_en %>%
  paged_table()

prefecture <chr>	ku <chr>
北海道	hokkaido
青森	aomori
岩手	iwate
宮城	miyagi
秋田	akita
山形	yamagata
福島	fukushima
茨城	ibaraki
栃木	tochigi
群馬	gunma
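
When stripping the suffixes 都, 府, and 県 from prefecture names, it matters whether the pattern is anchored to the end of the string, because 京都府 contains 都 in the middle. The sketch below compares the two approaches on a few prefecture names (prefs is a small illustrative vector, not the full jpnprefs table).

```r
library(stringr)

prefs <- c("東京都", "京都府", "大阪府", "青森県", "北海道")

# Removing every occurrence also strips the 都 inside 京都府
remove_all <- str_remove_all(prefs, "県")
remove_all <- str_remove_all(remove_all, "都")
remove_all <- str_remove_all(remove_all, "府")

# Anchoring the pattern at the end of the string removes only the suffix
remove_suffix <- str_remove(prefs, "[都府県]$")

remove_all
remove_suffix
```

Only the anchored version yields 京都, which is the form needed to build district names such as 京都1区.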

・Merge df_hr and df_pref_en → create district names in Japanese
・Use the 2017 election only

df_hr17 <- df_hr %>% 
    left_join(df_pref_en, "ku") %>% 
    filter(year == 2017) %>% 
    mutate(
        district = str_c(prefecture, kun, "区") # prefecture name + number + 区
    )

df_hr17 %>% 
  paged_table()

year <dbl>	ku <chr>	kun <dbl>	status <dbl>
2017	aichi	1	1
2017	aichi	1	2
2017	aichi	1	2
2017	aichi	2	1
2017	aichi	2	1
2017	aichi	2	0
2017	aichi	3	1
2017	aichi	3	1
2017	aichi	3	0
2017	aichi	4	1

・Compute, for each district, the mean age of the candidates who ran in the 2017 House of Representatives election

df_hr17_age <- df_hr17 %>% 
    group_by(district) %>% 
    summarise(age_mean = mean(age, na.rm = TRUE))
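
The effect of na.rm in a grouped mean can be seen on a small example. The data below are invented for illustration; only the group_by() and summarise() pattern mirrors the code above.

```r
library(dplyr)

# Invented candidate ages; one age is missing in district B
cand_toy <- tibble(
  district = c("A", "A", "B", "B"),
  age      = c(40, 60, 50, NA)
)

age_toy <- cand_toy %>%
  group_by(district) %>%
  summarise(age_mean = mean(age, na.rm = TRUE))  # NA is dropped before averaging

age_toy
```

Without na.rm = TRUE, the mean for district B would be NA, and that district would show up as a missing fill colour on the map.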

Merge df_dist_map and df_hr17_age. First, inspect both data frames.

head(df_hr17_age)

# A tibble: 6 x 2
  district age_mean
  <chr>       <dbl>
1 愛知10区     60.5
2 愛知11区     54.3
3 愛知12区     49.7
4 愛知13区     58.7
5 愛知14区     44.3
6 愛知15区     40

head(df_dist_map)

Simple feature collection with 6 features and 4 fields
geometry type:  POLYGON
dimension:      XY
bbox:           xmin: 141.007 ymin: 41.71421 xmax: 141.0312 ymax: 41.71973
CRS:            4612
  kucode    kuname ken ku                       geometry
1    108 北海道8区   1  8 POLYGON ((141.0075 41.71472...
2    108 北海道8区   1  8 POLYGON ((141.0082 41.71508...
3    108 北海道8区   1  8 POLYGON ((141.0289 41.71565...
4    108 北海道8区   1  8 POLYGON ((141.031 41.71727,...
5    108 北海道8区   1  8 POLYGON ((141.0286 41.71796...
6    108 北海道8区   1  8 POLYGON ((141.0306 41.71943...

Rename ‘kuname’ in ‘df_dist_map’ to ‘district’

names(df_dist_map)[2] <- "district"

names(df_dist_map)

[1] "kucode"   "district" "ken"      "ku"       "geometry"

df_dist_map_age <- df_dist_map %>% 
    left_join(df_hr17_age, by = "district") %>% 
    st_as_sf()
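
A left join on a character key silently leaves NA wherever the key strings differ, so it can be worth listing the unmatched districts after a merge like this one. The following is a sketch on invented data: map_toy and age_toy are hypothetical, and the geometry column is omitted because it plays no role in the key matching.

```r
library(dplyr)

# Hypothetical keys: the map has three districts, the age table only two
map_toy <- tibble(district = c("静岡1区", "静岡2区", "京都1区"))
age_toy <- tibble(district = c("静岡1区", "静岡2区"),
                  age_mean = c(52.5, 48))

# Rows of the map that find no partner in the age table
unmatched <- anti_join(map_toy, age_toy, by = "district")

unmatched$district
```

An empty result means every polygon received an age_mean value.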

・Draw a map of the mean candidate age in the districts of Shizuoka Prefecture

map_dist_sizuoka_age <- df_dist_map_age %>% 
    filter(str_detect(district, "静岡")) %>% 
    ggplot() +
    geom_sf(aes(fill = age_mean)) +
    scale_fill_distiller(name = "候補者の年齢の平均",
                         palette = "YlOrRd", direction = 1) +
    theme_map(base_family = "HiraginoSans-W3") +
    theme(legend.position = c(.1, -.1),
          legend.direction = "horizontal",
          legend.title = element_text(size = 15), 
          legend.text = element_text(size = 15),
          legend.key.size = unit(1, "cm"),
          legend.key.width = unit(3,"cm")) +
    coord_sf(datum = NA) 

map_dist_sizuoka_age

House of Representatives single-member districts, map of mean candidate age: Shizuoka Prefecture (2017)

・Save the map to the fig folder

ggsave("fig/map_dist_sizuoka_age.png", map_dist_sizuoka_age,
       width = 10, height = 10)

・To add district names to the map, use map_dbl() to extract the centroid coordinates of each polygon

map_dist_sizuoka_text <- df_dist_map_age %>% 
    filter(str_detect(district, "静岡")) %>% 
    mutate(
        text_x = map_dbl(geometry, ~st_centroid(.x)[[1]]),
        text_y = map_dbl(geometry, ~st_centroid(.x)[[2]]),
        district = str_remove_all(district, "静岡")
        ) %>% 
    ggplot() +
    geom_sf(aes(fill = age_mean)) +
    geom_label(aes(x = text_x, y = text_y, label = district), 
               size = 5, family = "HiraginoSans-W3") +
    scale_fill_distiller(name = "候補者の年齢の平均",
                         palette = "YlOrRd", direction = 1) +
    theme_map(base_family = "HiraginoSans-W3") +
    theme(legend.position = c(.1, -.1),
          legend.direction = "horizontal",
          legend.title = element_text(size = 15), 
          legend.text = element_text(size = 15),
          legend.key.size = unit(1, "cm"),
          legend.key.width = unit(3,"cm")) +
    coord_sf(datum = NA) 

map_dist_sizuoka_text

House of Representatives single-member districts, map of mean candidate age: Shizuoka Prefecture (2017)

・Save the map to the fig folder

ggsave("fig/map_dist_sizuoka_text.png",map_dist_sizuoka_text,
       width = 10, height = 10)
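
The label positions above come from st_centroid(), whose result for a single geometry is a point that can be indexed with [[1]] (x) and [[2]] (y). A minimal sketch on a hand-made square, assuming only that the sf package is installed; the polygon is invented and is not taken from the district data.

```r
library(sf)

# A 2 x 2 square with corners (0,0), (2,0), (2,2), (0,2); the ring must be closed
sq <- st_polygon(list(rbind(c(0, 0), c(2, 0), c(2, 2), c(0, 2), c(0, 0))))

ctr <- st_centroid(sq)   # a POINT geometry
text_x <- ctr[[1]]       # x coordinate of the centroid
text_y <- ctr[[2]]       # y coordinate of the centroid

c(text_x, text_y)
```

For an irregular or multi-part district the centroid can fall outside the polygon; sf also provides st_point_on_surface() for such cases.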

10. Dot plots

This section uses a processed version of the results of the 2009 House of Representatives election (single-member districts).

hr09_ldp_seatshare.csv

・Download hr09_ldp_seatshare.csv from the link above
→ Put it in the ‘data’ folder inside the RProject folder
・Read this csv file, name it hr09_ldp, and display it as an interactive table with the datatable() function from the DT package.

hr09_ldp <- read_csv("data/hr09_ldp_seatshare.csv",
                         locale = locale(encoding = "cp932"))

DT::datatable(hr09_ldp)

	year	pref	id	nosmd	dpj
1	2009	aichi	1	15	15
2	2009	akita	2	3	2
3	2009	fukushima	8	5	5
4	2009	iwate	16	4	4
5	2009	nagano	26	5	5
6	2009	nagasaki	27	4	4
7	2009	niigata	29	6	6
8	2009	oita	30	3	2
9	2009	okinawa	32	4	2
10	2009	saitama	35	15	14

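・The dataset contains the following variables

year : year in which the House of Representatives election was held
id : prefecture id (1-47)
prefecture : prefecture name (Japanese)
pref : prefecture name (English)
nosmd : total number of single-member districts in the prefecture (1-25)
ldp : number of single-member districts in the prefecture won by LDP candidates
ldp_ratio : share of the prefecture's single-member districts won by LDP candidates (%)
dpj : number of single-member districts in the prefecture won by DPJ candidates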
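
Going by the definitions above, ldp_ratio should equal ldp divided by nosmd, expressed as a percentage. The sketch below checks this on three 2009 rows (aomori, miyazaki, kagoshima) as they appear in the table displayed in section 10.2. That the published column is rounded to a whole number is an assumption inferred from those rows.

```r
# nosmd and ldp for three prefectures, copied from the 2009 rows shown in section 10.2
nosmd <- c(aomori = 4, miyazaki = 3, kagoshima = 5)
ldp   <- c(aomori = 3, miyazaki = 2, kagoshima = 3)

# Seat share in percent, rounded to a whole number (assumed rounding rule)
ldp_ratio <- round(100 * ldp / nosmd)

ldp_ratio
```

The results agree with the ldp_ratio values listed for those prefectures (75, 67, and 60).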

10.1. Basic dot plots

Dot plots have little visual clutter and are easy to read, so they are often used in place of bar charts or histograms
The most basic dot plot can be drawn as follows

ggplot(hr09_ldp, aes(x = ldp_ratio, y = pref)) + geom_point()

The prefectures are listed in alphabetical order from the bottom up
Use reorder() to sort them by the LDP's seat share (ldp_ratio) for a more readable dot plot

ggplot(hr09_ldp, aes(x = ldp_ratio, y = reorder(pref, ldp_ratio))) +
  geom_point(size = 2) +
  theme_bw() +
  theme(panel.grid.major.x = element_blank(),
        panel.grid.minor.x = element_blank(),
        panel.grid.major.y = element_line(colour = "grey60", linetype = "dashed"))
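
What reorder() does to the axis order can be seen without drawing anything: it returns a factor whose levels are sorted by the second argument. A small sketch with made-up ratios (the three values are illustrative, not taken from hr09_ldp):

```r
# Made-up seat shares for three prefectures
pref      <- c("akita", "fukui", "kochi")
ldp_ratio <- c(33, 100, 50)

# Default factor levels are alphabetical
levels(factor(pref))

# reorder() sorts the levels by ldp_ratio, lowest first
pref_sorted <- reorder(pref, ldp_ratio)
levels(pref_sorted)
```

ggplot2 draws discrete axis values in level order, which is why wrapping pref in reorder() changes the order of the rows in the plot.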

The x and y axes can also be swapped, as follows

ggplot(hr09_ldp, aes(x = reorder(pref, ldp_ratio), y = ldp_ratio)) +
  geom_point(size = 2) +
  theme_bw() +
  theme(axis.text.x = element_text(angle = 60, hjust = 1),
        panel.grid.major.y = element_blank(),
        panel.grid.minor.y = element_blank(),
        panel.grid.major.x = element_line(colour = "grey60", linetype = "dashed"))

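10.2 Grouped dot plots

Next, use facet_grid() to show, with dot plots, how the LDP's seat share in each prefecture changed across the three House of Representatives elections of 2009, 2012, and 2014

Read the results of the three elections held between 2009 and 2014 (hr09_14_ldp_seatshare.csv), name the data frame LDP, and inspect it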

hr09_14_ldp_seatshare.csv

LDP <- read_csv("hr09_14_ldp_seatshare.csv",
                         locale = locale(encoding = "cp932"))

The datatable() function from the DT package displays the data as an interactive table.

DT::datatable(LDP)

	year	pref	id	nosmd	ldp	ldp_ratio	dpj
1	2009	fukui	6	3	3	100	0
2	2009	kochi	20	3	3	100	0
3	2009	shimane	37	2	2	100	0
4	2009	tottori	42	2	2	100	0
5	2009	aomori	3	4	3	75	1
6	2009	ehie	5	4	3	75	1
7	2009	yamaguchi	46	4	3	75	1
8	2009	miyazaki	25	3	2	67	0
9	2009	toyama	43	3	2	67	1
10	2009	kagoshima	18	5	3	60	1

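・To plot by election year, convert the year variable to a factor, sort the prefectures by the LDP's seat share in the 2009 election, and then show the LDP's seat share by prefecture for each of the three elections.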

# Convert the year variable to a factor
LDP$year <- factor(LDP$year) 

# Add a variable order_seq and store the result as LDP_reordered
# (ranks 1-47 for the 2009 rows, 0 for the 2012 and 2014 rows)
LDP_reordered <- LDP %>%
  arrange(year, ldp_ratio) %>%
  mutate(order_seq = c(1:47, rep(0, 47*2))) 

ggplot(LDP_reordered, aes(x = ldp_ratio, y = reorder(pref, order_seq))) + 
  geom_segment(aes(yend = pref), xend = 0, colour = "grey50") +
  geom_point(size = 2, aes(colour = year)) +
  scale_colour_brewer(palette = "Set1", limits = c("2009", "2012", "2014"),
                      guide = "none") +  # guide = FALSE is deprecated in newer ggplot2
  theme(panel.grid.major.y = element_blank()) +
  facet_grid(~ year, scales = "free_y", space = "free_y")
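
The order_seq trick relies on reorder() averaging order_seq within each prefecture: the 2009 row carries the rank and the later rows carry 0, so the averages keep the 2009 ranking. A sketch on a made-up two-election data set (toy is hypothetical, and the number of prefectures is computed instead of hard-coding 47):

```r
library(dplyr)

# Hypothetical data: three prefectures, two elections
toy <- tibble(
  year      = factor(c(2009, 2009, 2009, 2012, 2012, 2012)),
  pref      = c("a", "b", "c", "a", "b", "c"),
  ldp_ratio = c(50, 0, 100, 80, 90, 70)
)

n_pref <- n_distinct(toy$pref)

toy_ord <- toy %>%
  arrange(year, ldp_ratio) %>%                  # the 2009 rows come first, lowest share first
  mutate(order_seq = c(seq_len(n_pref),         # ranks 1..n for the 2009 rows
                       rep(0, n() - n_pref)))   # 0 for every later row

y_levels <- levels(reorder(toy_ord$pref, toy_ord$order_seq))
y_levels
```

The resulting levels follow the 2009 shares (b, a, c), regardless of how the prefectures rank in 2012.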

Share of single-member districts won by LDP candidates in the 2009-2014 House of Representatives elections (by prefecture)

・Drawn as a bar chart, the same data look like this

# Read the dataset
LDP <- read_csv("data/hr09-14_ldp_seatshare.csv")

# Convert the year variable to a factor
LDP$year <- factor(LDP$year)

LDP_reordered <- LDP %>%
  arrange(year, ldp_ratio) %>%
  mutate(order_seq = c(1:47, rep(0, 47*2)))

ggplot(LDP_reordered, aes(x = reorder(pref, order_seq), y = ldp_ratio,  fill = year)) +
  geom_bar(stat = "identity") +
  facet_grid(~ year, scales = "free_x") +
  theme(legend.position = "none") +
  labs(x = "Prefecture", y = "LDP's seat share (%)") +
  coord_flip()

Share of single-member districts won by LDP candidates in the 2009-2014 House of Representatives elections (by prefecture)

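・In the 2009 House of Representatives election, in which power shifted from the LDP to the DPJ, LDP candidates won every single-member district in four prefectures: Tottori, Shimane, Kochi, and Fukui

By contrast, in five prefectures (Shizuoka, Oita, Niigata, Nagasaki, and Akita), the LDP lost every single-member district in the 2009 election, yet its candidates won 100% of the districts in the 2012 election that returned power to the LDP under Abe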
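
Statements like these can be read off the data with a filter on ldp_ratio. The sketch below uses five 2009 rows copied from the LDP table shown earlier in this section; with the full data, the same filter would be applied to LDP itself.

```r
library(dplyr)

# Five 2009 rows copied from the LDP table displayed above
ldp_toy <- tibble(
  year      = c(2009, 2009, 2009, 2009, 2009),
  pref      = c("fukui", "kochi", "shimane", "tottori", "aomori"),
  ldp_ratio = c(100, 100, 100, 100, 75)
)

# Prefectures where LDP candidates won every single-member district in 2009
sweep_2009 <- ldp_toy %>%
  dplyr::filter(year == 2009, ldp_ratio == 100) %>%
  pull(pref)

sweep_2009
```

Replacing ldp_ratio == 100 with ldp_ratio == 0 would list the prefectures where the LDP won no district in that year.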

Reference

Kieran Healy, Data Visualization, Princeton University Press, 2019.
浅野正彦, 矢内勇生, 『Rによる計量政治学』, オーム社, 2018.
浅野正彦, 中村公亮, 『初めてのRStudio』, オーム社, 2018.
松村他, 『RユーザのためのRStudio[実践]入門』, 2018.
Winston Chang, R Graphics Cookbook, O’Reilly Media, 2012.
Kosuke Imai, Quantitative Social Science: An Introduction, Princeton University Press, 2017.

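Online teaching materials maintained by empirical researchers:

・因果推論のための計量経済学 (Professor Hirofumi Kurokawa's site, University of Hyogo)
・Kosuke Imai’s Teaching (Professor Kosuke Imai's course materials, Harvard University)
・Jaehyun SONG (Professor Jaehyun Song's course materials, Doshisha University)
・Rで計量政治学入門 (Professor Shohei Doi's course materials, Hitotsubashi University)
・UNBOUNDEDLY (KRSK's site, Harvard University)
・Yuki YANAI (Professor Yuki Yanai's course materials, Kochi University of Technology)
・Yusuke TSUGAWA (statistics by Professor Yusuke Tsugawa, UCLA)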

3.2.Data Visualization [2]

Masahiko Asano

2020-11-24
