What I did

- Set up a BigGorilla environment
- Tried the FlexMatcher example

What I found

- The current standard practice seems to be to use pyenv only to install Anaconda, and to manage environments with conda.
- ~~(As of July 12, 2017) Setting up the environment does not go smoothly~~
  - ~~The published conda environment has a broken dependency~~
  - ~~You can download the yml from Anaconda Cloud, delete the line that pins urllib, and install from the local file~~
  - Addendum: the file has since been updated, and the environment can now be installed as described in the official documentation.
- The FlexMatcher sample did not work either
  - It looks hard to get working without reading the code

What to do next

- Read the FlexMatcher code

Environment

- Mac OS X 10.11 El Capitan
- Homebrew already installed
- Anaconda installed via pyenv

Setting up the BigGorilla environment (the information below is outdated; it is kept as a work log)

pyenv was out of date, so I updated it first. Reference: "Updating the Python version managed by pyenv" (Qiita)

Install Anaconda

$ pyenv install anaconda3-4.2.0
$ pyenv global anaconda3-4.2.0

Create the environment for BigGorilla... ~~it fails.~~ Postscript (July 21, 2017): this now works. The old record follows.

$ conda env create biggorilla/py3gorilla
Collecting urllib==1.21.1
Downloading urllib-1.21.1.tar.gz (226kB)
100% |████████████████████████████████| 235kB 640kB/s
Complete output from command python setup.py egg_info:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/private/var/folders/bx/k4yrl_bd3nb0v8pz7fm60t8r0000gp/T/pip-build-58rsg5li/urllib/setup.py", line 191
s.connect((base64.b64decode(rip), 017620))
                                  ^
SyntaxError: invalid token
 ----------------------------------------
Command "python setup.py egg_info" failed with error code 1 in /private/var/folders/bx/k4yrl_bd3nb0v8pz7fm60t8r0000gp/T/pip-build-58rsg5li/urllib/
CondaValueError: Value error: pip returned an error.
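
The SyntaxError in the log above is consistent with Python 2-only source being run under Python 3: 017620 is a Python 2 style octal literal, which Python 3 rejects (it requires the 0o prefix). The following is a minimal sketch of that behavior using only the standard library; the helper name is hypothetical and is not part of any package mentioned here.

```python
def is_valid_python3_expression(source: str) -> bool:
    """Return True if `source` compiles as an expression on this interpreter."""
    try:
        compile(source, "<example>", "eval")
        return True
    except SyntaxError:
        return False


# Python 2 style octal literal, as on the failing setup.py line
old_style_ok = is_valid_python3_expression("017620")
# Python 3 spelling of the same octal number
new_style_ok = is_valid_python3_expression("0o17620")

print(old_style_ok)  # False on Python 3
print(new_style_ok)  # True
print(0o17620)       # 8080
```

In other words, the package pulled in by the urllib==1.21.1 pin cannot even be parsed by the Python 3 interpreter of the environment, which is why pip stops at the egg_info step.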

The environment was not fully installed, but I tried activating it anyway. Running source activate Py3Gorilla makes the shell exit. When Anaconda is installed through pyenv, conda's activate script has to be called with its full path. References: "Notes on using conda" (Qiita); "Building a Python environment for aspiring data scientists, 2016" (Qiita)

$ conda info -e
# conda environments:
#
Py3Gorilla               /Users/kkanazaw/.pyenv/versions/anaconda3-4.2.0/envs/Py3Gorilla
root                  *  /Users/kkanazaw/.pyenv/versions/anaconda3-4.2.0

$ source /Users/kkanazaw/.pyenv/versions/anaconda3-4.2.0/envs/Py3Gorilla/activate Py3Gorilla

I tried the Jupyter notebook provided for verifying the installation, but it reports that the Py3Gorilla kernel cannot be found.

$ anaconda download biggorilla/hi_gorilla
$ jupyter notebook hi_gorilla.ipynb

---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
<ipython-input-1-770f0b5370fe> in <module>()
----> 1 import py_stringmatching as sm
      2
      3 # This notebook imports a package that most users do not have installed
      4 # before using BigGorilla. Running the notebook successfully implies the
      5 # successful installation of BigGorilla.

ImportError: No module named 'py_stringmatching'
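
An ImportError like the one above usually means that the interpreter behind the notebook kernel does not have the package installed, for example because the kernel runs from the root Anaconda rather than from the Py3Gorilla environment. That diagnosis is an assumption here. A minimal sketch for checking it from a notebook cell, using only the standard library (the helper name is hypothetical):

```python
import importlib.util
import sys


def module_available(name: str) -> bool:
    """Return True if the top-level module `name` can be found by this interpreter."""
    return importlib.util.find_spec(name) is not None


# Path of the interpreter actually backing this kernel
print(sys.executable)
# Whether the package the notebook needs is visible to it
print(module_available("py_stringmatching"))
```

If the printed interpreter path does not point inside the expected envs directory, the notebook is not running in the environment that was just created.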

Once conda env create has run, even partially, running it again fails with an error saying that the prefix already exists. To delete the environment, use conda env remove -n.

$ conda env create biggorilla/py3gorilla
Using Anaconda API: https://api.anaconda.org
CondaValueError: Value error: prefix already exists: /Users/kkanazaw/.pyenv/versions/anaconda3-4.2.0/envs/Py3Gorilla

$ conda env remove -n Py3Gorilla

Package plan for package removal in environment /Users/kkanazaw/.pyenv/versions/anaconda3-4.2.0/envs/Py3Gorilla:

The following packages will be REMOVED:

openssl:    1.0.2l-0
pip:        9.0.1-py36_1
python:     3.6.1-2
readline:   6.2-2
setuptools: 27.2.0-py36_0
sqlite:     3.13.0-0
tk:         8.5.18-0
wheel:      0.29.0-py36_0
xz:         5.2.2-1
zlib:       1.2.8-3

Proceed ([y]/n)? y

Unlinking packages ...
[      COMPLETE      ]|###############################################################################| 100%

~~When I tried it on July 12, 2017, this method produced the following error and the installation failed. (Perhaps the file updated in June has the wrong name and the old yml is still being applied. It will probably be fixed in a future update.)~~

Addendum: the file has been updated, so the installation now works as described in the official documentation.

$ conda env create biggorilla/py3gorilla
Collecting urllib==1.21.1
Downloading urllib-1.21.1.tar.gz (226kB)
100% |████████████████████████████████| 235kB 640kB/s
Complete output from command python setup.py egg_info:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/private/var/folders/bx/k4yrl_bd3nb0v8pz7fm60t8r0000gp/T/pip-build-58rsg5li/urllib/setup.py", line 191
s.connect((base64.b64decode(rip), 017620))
                                  ^
SyntaxError: invalid token
 ----------------------------------------
Command "python setup.py egg_info" failed with error code 1 in /private/var/folders/bx/k4yrl_bd3nb0v8pz7fm60t8r0000gp/T/pip-build-58rsg5li/urllib/
CondaValueError: Value error: pip returned an error.

You can install it by downloading the yml from "Files :: Anaconda Cloud" and deleting the line that pins urllib. The newest yml also installs, but the flexmatcher version it provides is old (a regression?)

# Remove the environment that was left half-installed
$ conda env remove -n Py3Gorilla

# Recreate the environment from the locally edited yml file
$ vim ~/Downloads/Py3Gorilla.yml  # delete the urllib line
$ conda env create --name test --file ~/Downloads/Py3Gorilla.yml

# With pyenv, conda's activate script must be called with its full path. Running source activate Py3Gorilla makes the shell exit.
$ source /Users/kkanazaw/.pyenv/versions/anaconda3-4.2.0/envs/test/bin/activate test

# Download the verification notebook and launch it
$ anaconda download biggorilla/hi_gorilla
$ jupyter notebook hi_gorilla.ipynb
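
The manual edit of the yml file in the steps above could also be scripted. The sketch below removes any dependency line that pins a given package; the function name is hypothetical and the sample yml content is invented for illustration, since the real Py3Gorilla.yml is not reproduced here.

```python
def drop_pinned_package(yml_text: str, package: str) -> str:
    """Remove dependency lines that pin `package`, e.g. '- urllib==1.21.1'."""
    kept = []
    for line in yml_text.splitlines():
        # Strip indentation and the leading list dash to get the bare entry
        entry = line.strip().lstrip("-").strip()
        # Match 'urllib' or 'urllib=...' / 'urllib==...', but not 'urllib3==...'
        if entry == package or entry.startswith(package + "="):
            continue
        kept.append(line)
    return "\n".join(kept) + "\n"


# Illustrative content only, NOT the real Py3Gorilla.yml
sample = """name: Py3Gorilla
dependencies:
- python=3.5
- pip:
  - urllib==1.21.1
  - urllib3==1.21.1
"""

cleaned = drop_pinned_package(sample, "urllib")
print(cleaned)
```

Matching on the exact package name followed by = keeps similarly named packages such as urllib3 untouched.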

Try the FlexMatcher example

Next, I tried the FlexMatcher example.

Sample code is provided, so I copied the source and pasted it into a Jupyter notebook.

When I ran it, it failed with an error.

Execution result

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-5-34cd037abc3a> in <module>()
     27 mapping_list = [data1_mapping, data2_mapping]
     28 fm.create_training_data(schema_list, mapping_list)
---> 29 fm.train()
     30 
     31 # Creating a test schmea

/Users/kkanazaw/.pyenv/versions/anaconda3-4.2.0/envs/test/lib/python3.5/site-packages/flexmatcher/flexmatcher.py in train(self)
     27     The class considers panda dataframes as databases and their column names as
     28     the schema. FlexMatcher learn to do schema matching by training on
---> 29     instances of dataframes and how their columns are matched against the
     30     mediated schema.
     31 

/Users/kkanazaw/.pyenv/versions/anaconda3-4.2.0/envs/test/lib/python3.5/site-packages/flexmatcher/flexmatcher.py in <listcomp>(.0)
     27     The class considers panda dataframes as databases and their column names as
     28     the schema. FlexMatcher learn to do schema matching by training on
---> 29     instances of dataframes and how their columns are matched against the
     30     mediated schema.
     31 

/Users/kkanazaw/.pyenv/versions/anaconda3-4.2.0/envs/test/lib/python3.5/site-packages/flexmatcher/classify.py in predict_training(self, folds)

TypeError: 'float' object cannot be interpreted as an integer
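
This TypeError is typical of code that divides two integers with / and passes the result to something that requires an integer, such as range(); in Python 3, / always returns a float. Whether that is the actual cause inside flexmatcher's predict_training is an assumption, since the traceback does not show the failing line. A minimal sketch of the pattern, with hypothetical names and values:

```python
def fold_size(n_rows: int, folds: int) -> int:
    """Rows per fold, using floor division so the result stays an int."""
    return n_rows // folds


n_rows, folds = 10, 5

try:
    range(n_rows / folds)  # true division yields the float 2.0 in Python 3
except TypeError as exc:
    error_message = str(exc)
    print(error_message)

# Floor division keeps the value an integer, so range() accepts it
print(list(range(fold_size(n_rows, folds))))
```

If reading the FlexMatcher code confirms a division of this kind, replacing / with // (or wrapping the result in int()) at that point would be the usual fix.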

BigGorilla environment setup notes