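Question

I received a report that the "Converting a data frame from R to pandas" part of the article "sage/Converting data frames between R and pandas (Sage)" does not work in Sage 6.3. I downloaded the following file.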

Installing pandas

Open a terminal. I installed Sage-6.3.app in /Applications, so I run the following command to start a Sage shell.
$ /Applications/Sage-6.3.app/Contents/Resources/sage/sage -sh
Starting subshell with Sage environment variables set.  Don't forget
to exit when you are done.  Beware:
 * Do not do anything with other copies of Sage on your system.
 * Do not use this for installing Sage packages using "sage -i" or for
   running "make" at Sage's root directory.  These should be done
   outside the Sage shell.

Bypassing shell configuration files...

Note: SAGE_ROOT=/Applications/Sage-6.3.app/Contents/Resources/sage

 (sage-sh) $ 
From this shell, install pandas with easy_install.
 (sage-sh) $ easy_install pandas
Searching for pandas
Reading https://pypi.python.org/simple/pandas/
Best match: pandas 0.14.1

(output omitted)
Installed /Applications/Sage-6.3.app/Contents/Resources/sage/local/lib/python2.7/site-packages/pandas-0.14.1-py2.7-macosx-10.7-x86_64.egg
Processing dependencies for pandas
Finished processing dependencies for pandas
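
Before going back to the notebook, it can help to confirm that the Python inside the Sage shell actually sees the new package. The following is a minimal sketch, not part of the original procedure; `module_version` is a hypothetical helper name introduced here for illustration.

```python
import importlib


def module_version(name):
    """Return the module's __version__ string, or None if it cannot be imported.

    Hypothetical helper for illustration only.
    """
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None
    # Not every module defines __version__, so fall back to a placeholder.
    return getattr(module, "__version__", "unknown")


# Prints the installed pandas version, or None when pandas is missing.
print(module_version("pandas"))
```

If this prints None from the Sage shell, the easy_install step probably did not install into Sage's own site-packages.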

Reproducing the problem

I reproduced the problem in the notebook with the following steps.

{{{id=1|
# Install jsonlite
r("install.packages('jsonlite')")
///
NULL
}}}

{{{id=2|
# Install the R Graphics Cookbook data
r("install.packages('gcookbook')")
///
NULL
}}}

{{{id=3|
# Import numpy and pandas
import pandas as pd
import numpy as np
///
}}}

{{{id=4|
# Load jsonlite
r('library(jsonlite)')
///
[1] "jsonlite" "stats" "graphics" "grDevices" "utils" "datasets" "methods" "base"
}}}

{{{id=5|
# Load gcookbook
r('library(gcookbook)')
///
[1] "gcookbook" "jsonlite" "stats" "graphics" "grDevices" "utils" "datasets" "methods"
[9] "base"
}}}

{{{id=6|
# Fetch data from R in JSON format
# As an example, get the gcookbook sample data from R
test_json = r('toJSON(heightweight, pretty=FALSE)')
///
}}}

{{{id=7|
# Uncomment the line below to inspect the value.
# test_json
///
}}}

{{{id=8|
heightweight = pd.read_json(sageobj(test_json))
heightweight.head()
///
Traceback (most recent call last):
  File "", line 1, in
  File "_sage_input_13.py", line 10, in
    exec compile(u'open("___code___.py","w").write("# -*- coding: utf-8 -*-\\n" + _support_.preparse_worksheet_cell(base64.b64decode("aGVpZ2h0d2VpZ2h0ID0gcGQucmVhZF9qc29uKHNhZ2VvYmoodGVzdF9qc29uKSkKaGVpZ2h0d2VpZ2h0LmhlYWQoKQ=="),globals())+"\\n"); execfile(os.path.abspath("___code___.py"))' + '\n', '', 'single')
  File "", line 1, in
  File "/private/var/folders/jx/7nsrq4lw8xj553006s7bvdb80000gn/T/tmpf7Jqyg/___code___.py", line 2, in
    heightweight = pd.read_json(sageobj(test_json))
  File "/Applications/Sage-6.3.app/Contents/Resources/sage/local/lib/python2.7/site-packages/pandas-0.14.1-py2.7-macosx-10.7-x86_64.egg/pandas/io/json.py", line 198, in read_json
    date_unit).parse()
  File "/Applications/Sage-6.3.app/Contents/Resources/sage/local/lib/python2.7/site-packages/pandas-0.14.1-py2.7-macosx-10.7-x86_64.egg/pandas/io/json.py", line 266, in parse
    self._parse_no_numpy()
  File "/Applications/Sage-6.3.app/Contents/Resources/sage/local/lib/python2.7/site-packages/pandas-0.14.1-py2.7-macosx-10.7-x86_64.egg/pandas/io/json.py", line 483, in _parse_no_numpy
    loads(json, precise_float=self.precise_float), dtype=None)
TypeError: Expected String or Unicode
}}}

{{{id=9|
sageobj(test_json)
///
{'_r_class': 'json', 'DATA': '[{"sex":"f","ageYear":11.92,"ageMonth":143,"heightIn":56.3,"weightLb":85},{"sex":"f","ageYear":12.92,"ageMonth":155,"heightIn":62.3,"weightLb":105},{"sex":"f","ageYear":12.75,"ageMonth":153,"heightIn":63.3,"weightLb":108},{"sex":"f","ageYear":13.42,"ageMonth":161,"heightIn":59,"weightLb":92},{"sex":"f","ageYear":15.92,"ageMonth":191,"heightIn":62.5,"weightLb":112.5}, (omitted) {"sex":"m","ageYear":12.58,"ageMonth":151,"heightIn":59.3,"weightLb":87}]'}
}}}

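Cause

In Sage 6.3, sageobj(test_json) returns a dict rather than a string, which is why pd.read_json raised "TypeError: Expected String or Unicode". The sageobj output shows that the JSON data is stored under the 'DATA' key, so the fix is to extract that value and pass it to pd.read_json.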

{{{id=10|
sageobj(test_json)['DATA']
///
'[{"sex":"f","ageYear":11.92,"ageMonth":143,"heightIn":56.3,"weightLb":85},{"sex":"f","ageYear":12.92,"ageMonth":155,"heightIn":62.3,"weightLb":105},{"sex":"f","ageYear":12.75,"ageMonth":153,"heightIn":63.3,"weightLb":108},{"sex":"f","ageYear":13.42,"ageMonth":161,"heightIn":59,"weightLb":92},{"sex":"f","ageYear":15.92,"ageMonth":191,"heightIn":62.5,"weightLb":112.5},{"sex":"f","ageYear":14.25,"ageMonth":171,"heightIn":62.5,"weightLb":112}, (omitted) {"sex":"m","ageYear":13.92,"ageMonth":167,"heightIn":62,"weightLb":107.5},{"sex":"m","ageYear":12.58,"ageMonth":151,"heightIn":59.3,"weightLb":87}]'
}}}

{{{id=11|
heightweight = pd.read_json(sageobj(test_json)['DATA'])
heightweight.head()
///
   ageMonth  ageYear  heightIn sex  weightLb
0       143    11.92      56.3   f      85.0
1       155    12.92      62.3   f     105.0
2       153    12.75      63.3   f     108.0
3       161    13.42      59.0   f      92.0
4       191    15.92      62.5   f     112.5
}}}

{{{id=14|
///
}}}
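
If the same notebook has to run on Sage installations where sageobj might hand back either a plain string or a dict like the one above, the extraction step can be wrapped in a small helper. This is only a sketch under that assumption: `r_json_text` is a hypothetical name, and the sample value is a stand-in shaped like the sageobj(test_json) output shown earlier, reduced to its first record. It uses the standard json module so that it runs without Sage or pandas.

```python
import json


def r_json_text(converted):
    """Return the JSON text from a value shaped like the sageobj() result.

    Hypothetical helper. Assumes only the two shapes discussed in this post:
    a plain string, or a dict whose 'DATA' entry holds the JSON text.
    """
    if isinstance(converted, dict):
        return converted['DATA']
    return converted


# Stand-in for sageobj(test_json), using the first record from the output above.
sample = {
    '_r_class': 'json',
    'DATA': '[{"sex":"f","ageYear":11.92,"ageMonth":143,"heightIn":56.3,"weightLb":85}]',
}

# Parse with the standard library just to show that the text is valid JSON.
records = json.loads(r_json_text(sample))
print(records[0]['heightIn'])
```

In the notebook, the corresponding call would be pd.read_json(r_json_text(sageobj(test_json))).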