{"id":7649,"date":"2015-06-16T09:28:42","date_gmt":"2015-06-16T00:28:42","guid":{"rendered":"http:\/\/www.techscore.com\/blog\/?p=7649"},"modified":"2018-11-14T16:33:47","modified_gmt":"2018-11-14T07:33:47","slug":"dmm","status":"publish","type":"post","link":"https:\/\/www.techscore.com\/blog\/2015\/06\/16\/dmm\/","title":{"rendered":"Clustering with a Dirichlet Mixture Distribution"},"content":{"rendered":"<p>There are many clustering methods, and you choose an appropriate one to suit your purpose and the distribution of your data<sup>(1)<\/sup>.<br \/>\nPowerful tools and libraries are readily available these days, so they cover most needs and let you use a wide range of advanced algorithms right away<sup>(2)<\/sup>.<br \/>\nStill, some algorithms are not provided, so being able to implement one yourself widens your options.<\/p>\n<p>Unfortunately, the \"Dirichlet mixture distribution\" introduced in this post is not available in <a href=\"http:\/\/scikit-learn.org\/stable\/modules\/clustering.html#clustering\" target=\"_blank\">scikit-learn<\/a>, 
the library I use most, so I will do my best to implement it myself (it has been a while since I last implemented an algorithm from scratch..).<\/p>\n<h3>Dirichlet mixture distribution<\/h3>\n<p>This time I looked at a way to model the distribution of, and to cluster, sets of \"data that sum to 1\".<br \/>\nOne example of such data is our company's customer values model \"<a href=\"https:\/\/www.synergy-marketing.co.jp\/cloud\/insightbox\/societas\/\" target=\"_blank\">Societas<\/a>\".<br \/>\nSocietas expresses a person's values as proportions (membership probabilities) over 12 trait patterns. Concretely, one person's values might be expressed as<\/p>\n<blockquote><p>\n[Pattern 2-2: home-oriented, earnest type]: 60%<br \/>\n[Pattern 2-1: family-loving, easygoing type]: 15%<br \/>\n[Pattern 4-1: self-centered, active type]: 10%<br \/>\n[Pattern 5-2: sociable, steady homemaker type]: 10%<br \/>\n...\n<\/p><\/blockquote>\n<p>and the task here is to fit a distribution to data of this kind.<br 
\/>\nFor example, the values profile that results from the Societas proportions above is \"practical-minded and dependable\", and it can be shown as a pie chart like the one below.<br \/>\n<a href=\"https:\/\/www.techscore.com\/blog\/wp\/wp-content\/uploads\/2015\/06\/societas_cluster.png\" rel=\"facebox\" rel=\"attachment wp-att-7776\"><img loading=\"lazy\" src=\"https:\/\/www.techscore.com\/blog\/wp\/wp-content\/uploads\/2015\/06\/societas_cluster-300x195.png\" alt=\"societas_cluster\" width=\"300\" height=\"195\" class=\"alignnone size-medium wp-image-7776\" srcset=\"https:\/\/www.techscore.com\/blog\/wp\/wp-content\/uploads\/2015\/06\/societas_cluster-300x195.png 300w, https:\/\/www.techscore.com\/blog\/wp\/wp-content\/uploads\/2015\/06\/societas_cluster.png 548w\" sizes=\"(max-width: 300px) 100vw, 300px\" \/><\/a><\/p>\n<p>A distribution model for this kind of data is the <a href=\"https:\/\/en.wikipedia.org\/wiki\/Dirichlet_distribution\" target=\"_blank\">Dirichlet distribution<\/a>. It is often described as the probability distribution over \"how likely each face of a die is to come up\": picture a lopsided die standing for a probability distribution (some faces come up more easily than others, but the face probabilities sum to 1).<\/p>\n<p>This time the data distribution is expressed by superimposing several of these Dirichlet distributions.<br \/>\nThat said, in practice this only means replacing the Gaussians in a Gaussian mixture model with Dirichlet distributions<sup>(3)<\/sup>.<\/p>\n<h3>Algorithm<\/h3>\n<p>First, the data distribution is formulated as a superposition of several Dirichlet distributions. \u03b1 denotes the Dirichlet parameters, \u03c0 the mixing ratios (the weight of each component), and K the number of superimposed distributions.<\/p>\n<p><img src='http:\/\/l.wordpress.com\/latex.php?latex=p%28x%29%20%3D%20%5Csum_%7Bk%3D1%7D%5EK%20%5Cpi_k%20%5C%20dirichlet%28x%7C%5Calpha_k%29%20&bg=FFFFFF&fg=000000&s=2' title='p(x) = \\sum_{k=1}^K \\pi_k \\ dirichlet(x|\\alpha_k) ' style='vertical-align:1%' class='tex' alt='p(x) = \\sum_{k=1}^K \\pi_k \\ dirichlet(x|\\alpha_k) ' \/><\/p>\n<p>Once the data distribution is written this way, the best parameters can be found by maximizing its likelihood; here, as with a Gaussian mixture, they are found with the EM algorithm.<br \/>\nThe only difference is that the M 
step solves the inverse of the digamma function with Newton's method.<\/p>\n<ol>\n<li>E step<br \/>\nCompute the responsibility of each mixture component.<\/p>\n<p><img src='http:\/\/l.wordpress.com\/latex.php?latex=r_%7Bnk%7D%20%3D%20%5Cfrac%20%7B%5Cpi_k%20%5C%20p%28x_n%20%7C%20%5Ctheta_k%29%7D%20%7B%5Csum_%7Bc%3D1%7D%5EK%20%5C%20%5Cpi_c%5C%20p%28x_n%20%7C%20%5Ctheta_c%29%7D%20&bg=FFFFFF&fg=000000&s=2' title='r_{nk} = \\frac {\\pi_k \\ p(x_n | \\theta_k)} {\\sum_{c=1}^K \\ \\pi_c\\ p(x_n | \\theta_c)} ' style='vertical-align:1%' class='tex' alt='r_{nk} = \\frac {\\pi_k \\ p(x_n | \\theta_k)} {\\sum_{c=1}^K \\ \\pi_c\\ p(x_n | \\theta_c)} ' \/><\/p>\n<\/li>\n<pre class=\"lang:python decode:true\">\r\nimport math\r\nimport operator\r\nfrom functools import reduce  # reduce is not a builtin in Python 3\r\n\r\nimport numpy as np\r\n\r\ndef dirichlet_pdf(x, alpha):\r\n    \"\"\"\r\n    Probability density function of the Dirichlet distribution\r\n    \"\"\"\r\n    return reduce(operator.mul, [x[d]**(alpha[d]-1.0) for d in range(len(alpha))]) * \\\r\n        math.gamma(sum(alpha)) \/ \\\r\n        reduce(operator.mul, [math.gamma(a) for a in alpha])\r\n\r\ndef E_step(K, X, alpha, pi, resp):\r\n    \"\"\"\r\n    Responsibility: the probability that data point xi belongs to class C\r\n    [[estep.png]]\r\n    \"\"\"\r\n    for k in range(K):\r\n\r\n        def f(x):\r\n            # weighted density of component k, normalized over all components\r\n            return pi[k] * dirichlet_pdf(x, alpha[k]) \/ \\\r\n                sum([pi[c] * dirichlet_pdf(x, alpha[c]) for c in range(K)])\r\n\r\n        resp[:, k] = np.apply_along_axis(f, 1, X)\r\n\r\n<\/pre>\n<li>M step<br \/>\nRecompute the parameter values using the current responsibilities.<br \/>\nThe Q 
function has the same form as for a Gaussian mixture, with the Gaussian replaced by the Dirichlet distribution.<\/p>\n<p><img src='http:\/\/l.wordpress.com\/latex.php?latex=Q%28%5Ctheta%2C%20%5Ctheta%5E%7Bold%7D%29%20%3D%20%5Csum_%7Bn%3D1%7D%5EN%20%5C%20%5Csum_%7Bk%3D1%7D%5EK%20r_%7Bnk%7D%5C%7B%20log%5C%20%5Cpi_k%20%5C%20dirichlet%28x_n%20%7C%20%5Calpha_k%29%20%5C%7D&bg=FFFFFF&fg=000000&s=2' title='Q(\\theta, \\theta^{old}) = \\sum_{n=1}^N \\ \\sum_{k=1}^K r_{nk}\\{ log\\ \\pi_k \\ dirichlet(x_n | \\alpha_k) \\}' style='vertical-align:1%' class='tex' alt='Q(\\theta, \\theta^{old}) = \\sum_{n=1}^N \\ \\sum_{k=1}^K r_{nk}\\{ log\\ \\pi_k \\ dirichlet(x_n | \\alpha_k) \\}' \/><\/p>\n<p>This function is concave in the parameters, so the new parameters are obtained by setting its partial derivatives to zero and maximizing it; for \u03b1 this gives the update<\/p>\n<p><img src='http:\/\/l.wordpress.com\/latex.php?latex=%5Calpha_%7Bkd%7D%20%3D%20%5Cpsi%5E%7B-1%7D%20%5Cbigl%28%20%5Cpsi%28%5Csum_%7Bd%3D1%7D%5ED%20%5Calpha_%7Bkd%7D%29%20%2B%20%5Cfrac%20%7B%5Csum_%7Bn%3D1%7D%5EN%20r_%7Bnk%7D%20%5Clog%20x_%7Bnd%7D%7D%7B%5Csum_%7Bn%3D1%7D%5EN%20r_%7Bnk%7D%7D%20%5Cbigr%29&bg=FFFFFF&fg=000000&s=2' title='\\alpha_{kd} = \\psi^{-1} \\bigl( \\psi(\\sum_{d=1}^D \\alpha_{kd}) + \\frac {\\sum_{n=1}^N r_{nk} \\log x_{nd}}{\\sum_{n=1}^N r_{nk}} \\bigr)' style='vertical-align:1%' class='tex' alt='\\alpha_{kd} = \\psi^{-1} \\bigl( \\psi(\\sum_{d=1}^D \\alpha_{kd}) + \\frac {\\sum_{n=1}^N r_{nk} \\log x_{nd}}{\\sum_{n=1}^N r_{nk}} \\bigr)' \/><\/p>\n<p>Here the inverse of the digamma function is computed with Newton's method<sup><a 
href=\"http:\/\/research.microsoft.com\/en-us\/um\/people\/minka\/papers\/dirichlet\/minka-dirichlet.pdf\" target=\"_blank\">(4)<\/a><\/sup>.<\/p>\n<p><img src='http:\/\/l.wordpress.com\/latex.php?latex=x%5E0%20%3D%20%5Cpsi%5E%7B-1%7D%28y%29%20%3D%20%5Cleft%5C%7B%5Cbegin%7Baligned%7D%20%20%20%20exp%28y%29%20%2B%201%2F2%20%26%3A%20if%5C%20y%20%5Cgeq%20-2.22%20%5C%5C%20%20%20%20-1%2F%28y-%5Cpsi%281%29%29%20%26%3A%20if%5C%20y%20%3C%20-2.22%5Cend%7Baligned%7D%5Cright.%20%5C%5Cx%5E%7Bt%2B1%7D%20%3D%20x%5E%7Bt%7D%20-%20%5Cfrac%7B%5Cpsi%28x%29%20-%20y%7D%7B%5Cpsi%5E%7B%27%7D%28x%29%7D&bg=FFFFFF&fg=000000&s=2' title='x^0 = \\psi^{-1}(y) = \\left\\{\\begin{aligned}    exp(y) + 1\/2 &: if\\ y \\geq -2.22 \\\\    -1\/(y-\\psi(1)) &: if\\ y < -2.22\\end{aligned}\\right. \\\\x^{t+1} = x^{t} - \\frac{\\psi(x) - y}{\\psi^{'}(x)}' style='vertical-align:1%' class='tex' alt='x^0 = \\psi^{-1}(y) = \\left\\{\\begin{aligned}    exp(y) + 1\/2 &: if\\ y \\geq -2.22 \\\\    -1\/(y-\\psi(1)) &: if\\ y < -2.22\\end{aligned}\\right. 
\\\\x^{t+1} = x^{t} - \\frac{\\psi(x) - y}{\\psi^{'}(x)}' \/><\/p>\n<\/li>\n<pre class=\"lang:python decode:true\">\r\n\r\nimport numpy as np\r\nfrom scipy.special import polygamma\r\n\r\ndigamma  = lambda x: polygamma(0, x)\r\ntrigamma = lambda x: polygamma(1, x)\r\n\r\ndef inverse_digamma(y):\r\n    \"\"\"\r\n    Invert the digamma function with Newton's method\r\n    \"\"\"\r\n    # initial guess, following the piecewise formula above\r\n    if y >= -2.22:\r\n        x = np.exp(y) + 0.5\r\n    else:\r\n        x = -1.\/(y+(-digamma(1)))\r\n\r\n    while True:\r\n        x_new = x - (digamma(x) - y) \/ trigamma(x)\r\n\r\n        if abs(x_new - x) < 1.0e-8:\r\n             break\r\n\r\n        x = x_new\r\n\r\n    return x_new\r\n\r\n\r\ndef M_step(K, X, alpha, pi, resp):\r\n    \"\"\"\r\n    Maximize the Q function (alpha and pi are updated in place)\r\n    \"\"\"\r\n    # keep the previous alpha so every dimension is updated from the same old values\r\n    alpha_old = alpha.copy()\r\n\r\n    # alpha update\r\n    for k in range(K):\r\n        for d in range(alpha.shape[1]):\r\n            \"\"\"\r\n            [[mstep_inv_digamma.png]]\r\n            \"\"\"\r\n            y = digamma(sum(alpha_old[k])) + (sum(resp[:, k] * np.log(X[:,d])) \/ sum(resp[:, k]))\r\n\r\n            alpha[k][d] = inverse_digamma(y)\r\n\r\n    # pi update\r\n    for k in range(K):\r\n        pi[k] = resp[:, k].sum() \/ X.shape[0]\r\n\r\n<\/pre>\n<li>Repeat the E and M steps; the \u03b1 and \u03c0 obtained once the log-likelihood has converged are the fitted parameters and mixing ratios.\n<pre class=\"lang:python decode:true\">\r\n\r\n# E-M step iterations\r\nfor nstep in range(200):\r\n\r\n    ###\r\n    # [E-step]\r\n    E_step(K, X, alpha, pi, resp)\r\n\r\n    ###\r\n    # [M-step]\r\n    M_step(K, X, alpha, pi, 
resp)\r\n\r\n\r\n<\/pre>\n<\/li>\n<\/ol>\n<h3>Trying it out<\/h3>\n<p>Now let's apply this EM algorithm for the Dirichlet mixture to the following 3-dimensional data.<br \/>\nI would really like to use 12 dimensions, matching the number of Societas categories, but I want to visualize the distribution at each iteration, so I use 3-dimensional data.<br \/>\nSince each data point sums to 1, the data set looks like this.<\/p>\n<pre class=\"lang:python decode:true\">\r\nX = np.array([\r\n    [  0.5,   0.2,   0.3],\r\n    [  0.1,   0.4,   0.5],\r\n....\r\n<\/pre>\n<p>The data are nominally 3-dimensional, but because of the sum-to-1 constraint they can be mapped onto 2 dimensions, where they fall inside a triangular region like the one below.<\/p>\n<p><a href=\"https:\/\/www.techscore.com\/blog\/wp\/wp-content\/uploads\/2015\/06\/test_data.png\" rel=\"facebox\" rel=\"attachment wp-att-7753\"><img loading=\"lazy\" src=\"https:\/\/www.techscore.com\/blog\/wp\/wp-content\/uploads\/2015\/06\/test_data.png\" alt=\"test_data\" width=\"440\" height=\"330\" class=\"alignnone size-full wp-image-7753\" srcset=\"https:\/\/www.techscore.com\/blog\/wp\/wp-content\/uploads\/2015\/06\/test_data.png 440w, https:\/\/www.techscore.com\/blog\/wp\/wp-content\/uploads\/2015\/06\/test_data-300x225.png 300w\" sizes=\"(max-width: 440px) 100vw, 
440px\" \/><\/a><\/p>\n<p>Here I use data like this, generated from 3 Dirichlet distributions, as the test distribution and check whether the algorithm described above can estimate its parameters.<\/p>\n<p><a href=\"https:\/\/www.techscore.com\/blog\/wp\/wp-content\/uploads\/2015\/06\/test_data_dist.png\" rel=\"facebox\" rel=\"attachment wp-att-7758\"><img loading=\"lazy\" src=\"https:\/\/www.techscore.com\/blog\/wp\/wp-content\/uploads\/2015\/06\/test_data_dist.png\" alt=\"dmm_test_data_dist\" width=\"440\" height=\"330\" class=\"alignnone size-full wp-image-7758\" srcset=\"https:\/\/www.techscore.com\/blog\/wp\/wp-content\/uploads\/2015\/06\/test_data_dist.png 440w, https:\/\/www.techscore.com\/blog\/wp\/wp-content\/uploads\/2015\/06\/test_data_dist-300x225.png 300w\" sizes=\"(max-width: 440px) 100vw, 440px\" \/><\/a><\/p>\n<p>The initial values are set arbitrarily, and we watch how the estimated distributions change over the iterations.<\/p>\n<p><a href=\"https:\/\/www.techscore.com\/blog\/wp\/wp-content\/uploads\/2015\/06\/dmm_iteration.gif\" rel=\"facebox\" rel=\"attachment wp-att-7756\"><img loading=\"lazy\" src=\"https:\/\/www.techscore.com\/blog\/wp\/wp-content\/uploads\/2015\/06\/dmm_iteration.gif\" alt=\"dmm_iteration\" width=\"440\" height=\"330\" class=\"alignnone size-full wp-image-7756\" 
\/><\/a><\/p>\n<p>The red points are the centers of the estimated distributions (the means of the Dirichlet distributions); as the iterations proceed they move toward the test data.<br \/>\nThe boundary regions between distributions are less clear-cut (how these regions get divided up is probably where the character of a distribution model shows), but the distributions that generated the original data seem to be estimated more or less correctly.<\/li>\n<p>The log-likelihood at each iteration is also plotted. It approaches convergence after about 100 iterations.<\/p>\n<p><a href=\"https:\/\/www.techscore.com\/blog\/wp\/wp-content\/uploads\/2015\/06\/dmm_loglikelihood.png\" rel=\"facebox\" rel=\"attachment wp-att-7761\"><img loading=\"lazy\" src=\"https:\/\/www.techscore.com\/blog\/wp\/wp-content\/uploads\/2015\/06\/dmm_loglikelihood.png\" alt=\"dmm_loglikelihood\" width=\"440\" height=\"330\" class=\"alignnone size-full wp-image-7761\" srcset=\"https:\/\/www.techscore.com\/blog\/wp\/wp-content\/uploads\/2015\/06\/dmm_loglikelihood.png 440w, https:\/\/www.techscore.com\/blog\/wp\/wp-content\/uploads\/2015\/06\/dmm_loglikelihood-300x225.png 300w\" sizes=\"(max-width: 440px) 100vw, 440px\" \/><\/a><\/p>\n<pre class=\"lang:python decode:true\">\r\nimport numpy as np\r\n\r\ndef data_creation_k3_d3(a, N=300):\r\n    \"\"\"\r\n    Create test data from 3 Dirichlet distributions\r\n    \"\"\"\r\n    x0 = 
np.array([np.random.dirichlet(a[0]) for i in range(N\/\/3)])\r\n    x1 = np.array([np.random.dirichlet(a[1]) for i in range(N\/\/3)])\r\n    x2 = np.array([np.random.dirichlet(a[2]) for i in range(N\/\/3)])\r\n    return np.vstack([x0, x1, x2])\r\n\r\nN = 300\r\nK = 3  # number of mixture components\r\n\r\n# true distribution parameters used to generate the data\r\nalpha_org = np.array([\r\n    [1., 2., 5.],\r\n    [5., 3., 4.],\r\n    [2., 5., 1.]\r\n    ])\r\n\r\n# create the data\r\nX = data_creation_k3_d3(alpha_org, N)\r\n\r\n# set the initial Dirichlet parameters arbitrarily (the K-means centers could also be used as initial values)\r\nalpha = np.array([\r\n    [1.5, 1., 1.], \r\n    [1., 1.5, 1.], \r\n    [1., 1., 1.5]])\r\n\r\n# mixing ratios: initialized to uniform\r\npi = np.ones(K)\/K \r\n\r\n# membership probability of each vector for each class\r\nresp = np.zeros((N, K))\r\n\r\n# E-M step iterations\r\nfor nstep in range(200):\r\n\r\n    ###\r\n    # [E-step]\r\n    E_step(K, X, alpha, pi, resp)\r\n\r\n    ###\r\n    # [M-step]\r\n    M_step(K, X, alpha, pi, resp)\r\n\r\n<\/pre>\n<h2>Closing thoughts<\/h2>\n<p>This was a simple implementation on simple data, but I think it gives a feel for how mixture distribution models work.<br 
\/>\nIn practice, after weighing data size, operations, and various other factors<sup>(5)<\/sup>, you will almost always adopt an algorithm from a proven machine learning library, but implementing one yourself deepens your understanding, which I think often helps with choosing a suitable algorithm, searching for parameters, and so on.<\/p>\n<p>Mixtures of distributions and latent variables are the gateway to latent semantic analysis, which leads on to topic models, so I would like to follow that path when I get the chance.<\/p>\n<p>As an aside, I use Emacs, and with iimage-mode I often insert images of formulas or hand-drawn sketches into function comments. It is quite handy when I come back to the code later.<\/p>\n<p><a href=\"https:\/\/www.techscore.com\/blog\/wp\/wp-content\/uploads\/2015\/06\/iimage_mode.png\" rel=\"facebox\" rel=\"attachment wp-att-7742\"><img loading=\"lazy\" src=\"https:\/\/www.techscore.com\/blog\/wp\/wp-content\/uploads\/2015\/06\/iimage_mode-300x163.png\" alt=\"iimage_mode\" width=\"300\" height=\"163\" class=\"alignnone size-medium wp-image-7742\" 
srcset=\"https:\/\/www.techscore.com\/blog\/wp\/wp-content\/uploads\/2015\/06\/iimage_mode-300x163.png 300w, https:\/\/www.techscore.com\/blog\/wp\/wp-content\/uploads\/2015\/06\/iimage_mode.png 617w\" sizes=\"(max-width: 300px) 100vw, 300px\" \/><\/a><\/p>\n<pre class=\"lang:python decode:true\">\r\ndef calc_log_likelihood(X, alpha, pi):\r\n    \"\"\"\r\n    [[likelihood.png]] <-- the image is displayed inline with M-x iimage-mode.\r\n    \"\"\"\r\n    ....\r\n<\/pre>\n<p>&lt;Footnotes&gt;<br \/>\n1) For example, K-means and Gaussian mixture models can be a poor fit when the data variables are categorical (because Euclidean distance is not appropriate).<br \/>\n2) The flip side of this convenience is a tendency to end up with \"what the tool can do &gt; what I can do\".<br \/>\n3) <a href=\"http:\/\/www.amazon.co.jp\/\u30d1\u30bf\u30fc\u30f3\u8a8d\u8b58\u3068\u6a5f\u68b0\u5b66\u7fd2-\u4e0b-\u30d9\u30a4\u30ba\u7406\u8ad6\u306b\u3088\u308b\u7d71\u8a08\u7684\u4e88\u6e2c-C-M-\u30d3\u30b7\u30e7\u30c3\u30d7\/dp\/4621061240\/ref=pd_sim_14_1?ie=UTF8&refRID=1K30DAYXG5YBTVPFGC2T\" target=\"_blank\">\u30d1\u30bf\u30fc\u30f3\u8a8d\u8b58\u3068\u6a5f\u68b0\u5b66\u7fd2 \u4e0b (\u30d9\u30a4\u30ba\u7406\u8ad6\u306b\u3088\u308b\u7d71\u8a08\u7684\u4e88\u6e2c)<\/a> (Pattern Recognition and Machine Learning, vol. 2), section 9.2.2<br \/>\n4) <a href=\"http:\/\/research.microsoft.com\/en-us\/um\/people\/minka\/papers\/dirichlet\/minka-dirichlet.pdf\" target=\"_blank\">Estimating a Dirichlet distribution.<\/a> Reportedly it reaches 10-14 digits of precision within a few iterations.<br 
\/>\n5) Searching for the optimal number of clusters, handling singularities where the likelihood function diverges, and so on<\/p>\n","protected":false},"excerpt":{"rendered":"<p>There are many clustering methods, and you choose an appropriate one to suit your purpose and the distribution of your data(1).<br \/>\nPowerful tools and libraries are readily available these days, so they cover most needs, and a wide range of advanced<br \/><a href=\"https:\/\/www.techscore.com\/blog\/2015\/06\/16\/dmm\/\">Read more...<\/a><\/p>\n","protected":false},"author":48,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[18],"tags":[95,202],"_links":{"self":[{"href":"https:\/\/www.techscore.com\/blog\/wp-json\/wp\/v2\/posts\/7649"}],"collection":[{"href":"https:\/\/www.techscore.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.techscore.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.techscore.com\/blog\/wp-json\/wp\/v2\/users\/48"}],"replies":[{"embeddable":true,"href":"https:\/\/www.techscore.com\/blog\/wp-json\/wp\/v2\/comments?post=7649"}],"version-history":[{"count":71,"href":"https:\/\/www.techscore.com\/blog\/wp-json\/wp\/v2\/posts\/7649\/revisions"}],"predecessor-version":[{"id":7809,"href":"https:\/\/www.techscore.com\/blog\/wp-json\/wp\/v2\/posts\/7649\/revisions\/7809"}],"wp:attachment":[{"href":"https:\/\/www.techscore.com\/blog\/wp-json\/wp\/v2\/media?parent=7649"}],"wp:term":[{"taxonomy":"cat
egory","embeddable":true,"href":"https:\/\/www.techscore.com\/blog\/wp-json\/wp\/v2\/categories?post=7649"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.techscore.com\/blog\/wp-json\/wp\/v2\/tags?post=7649"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}