On the server, the path to the zipped Meta CSV dump should be: /vltd/data/meta/dump/meta-csv-dump-2023-10/csv_output_current.zip
I launched the process on the server with the following command:
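Since the dump is a single zip archive, its CSV files can be read without unpacking it first. The following is only a minimal sketch using the standard library, not necessarily how omid_openalex reads the dump; the function name `iter_zipped_csv_rows` is hypothetical, and UTF-8 encoding and a header row in each CSV are assumptions.

```python
import csv
import io
import zipfile


def iter_zipped_csv_rows(zip_path):
    """Yield (member_name, row_dict) for every CSV file inside a zip archive."""
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            if not name.lower().endswith(".csv"):
                continue  # skip directories and non-CSV members
            with archive.open(name) as raw:
                # ZipFile.open returns a binary stream; wrap it so the csv
                # module receives text (UTF-8 is an assumption here).
                text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
                for row in csv.DictReader(text):
                    yield name, row
```

Rows are streamed one at a time, so memory use stays low even when the archive is large.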
(venvpy311) eliarz@OC-SRV:~/omid-openalex$ python -m omid_openalex.main -c config_server.yaml
The configuration file config_server.yaml is this one (still present in the repo):
(venvpy311) eliarz@OC-SRV:~/omid-openalex$ python -m omid_openalex.main -c config_server.yaml
100%|██████████████████████████████| 26104/26104 [24:53<00:00, 17.48it/s]
| 3/469 [00:49<2:08:21, 16.53s/it]
 55%|████████████████              | 260/469 [3:10:38<2:43:11, 46.85s/it]
100%|██████████████████████████████| 469/469 [5:25:26<00:00, 41.63s/it]
100%|██████████████████████████████| 130/130 [00:07<00:00, 17.79it/s]
Processing ../openalex_process/openalex_tables/works: 100%|██████████████████████████████| 19428/19428 [05:23<00:00, 60.12file/s]
Creating index...
Creating and indexing the database table for DOIs took 7.287015442053477 minutes
Processing ../openalex_process/openalex_tables/works: 100%|██████████████████████████████| 19428/19428 [03:14<00:00, 99.82file/s]
Creating index...
Creating and indexing the database table for PMIDs took 3.5944064259529114 minutes
Processing ../openalex_process/openalex_tables/works: 100%|██████████████████████████████| 19428/19428 [02:44<00:00, 117.98file/s]
Creating index...
Creating and indexing the database table for PMCIDs took 2.8042073686917623 minutes
Processing ../openalex_process/openalex_tables/sources: 100%|██████████████████████████████| 30/30 [00:00<00:00, 71.22file/s]
Creating index...
Creating and indexing the database table for ISSNs took 0.00852201779683431 minutes
Processing ../openalex_process/openalex_tables/sources: 100%|██████████████████████████████| 30/30 [00:00<00:00, 103.47file/s]
Creating index...
Creating and indexing the database table for WIKIDATAs took 0.005462292830149333 minutes
Processing ../openalex_process/meta_ids/primary_ents: 100%|██████████████████████████████| 9075/9075 [43:47<00:00, 3.45file/s]
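The log above reports one "create and index" step per identifier type (DOIs, PMIDs, PMCIDs, ISSNs, WIKIDATAs), each timed in minutes. As a rough illustration of that pattern, here is a minimal sketch assuming a SQLite backend, which the log itself does not confirm. The function names and the column names `supported_id` and `openalex_id` are hypothetical.

```python
import sqlite3
import time


def build_id_table(conn, table, rows):
    """Bulk-insert (supported_id, openalex_id) pairs, then index the lookup column.

    `table` is interpolated into the SQL, so it must be a trusted name.
    """
    start = time.time()
    conn.execute(
        f"CREATE TABLE IF NOT EXISTS {table} (supported_id TEXT, openalex_id TEXT)"
    )
    conn.executemany(f"INSERT INTO {table} VALUES (?, ?)", rows)
    # Creating the index after the bulk insert is usually faster than
    # maintaining it row by row during the load.
    conn.execute(f"CREATE INDEX IF NOT EXISTS idx_{table} ON {table}(supported_id)")
    conn.commit()
    return (time.time() - start) / 60  # elapsed minutes, as in the log lines


def lookup(conn, table, supported_id):
    """Return every OpenAlex ID stored for the given external identifier."""
    cur = conn.execute(
        f"SELECT openalex_id FROM {table} WHERE supported_id = ?", (supported_id,)
    )
    return [row[0] for row in cur.fetchall()]
```

With an index on the lookup column, each identifier found in the Meta dump can be resolved with a single indexed query instead of a scan over the OpenAlex tables.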
[x] copy the file "br.zip" to the server, into the folder omid-openalex/meta_rdf_2023_10, with the following command (run from E:/ in CMD) (try it and see whether it works):
scp .\br.zip [email protected]:/home/eliarz/meta_rdf_2023_10/
[x] once the file has been copied, launch the process by simply running the prov.py file with the following command (try it): python -m analytics.prov -c prov_analysis_config_server.yaml
scp [email protected]:/home/eliarz/openalex_analytics/provenance_analysis_results.json .
Processing multi-mapped OMIDs: 41678row [7:14:14, 1.03s/row]
{'sources': defaultdict(<class 'int'>, {'A': 995, 'non classified': 599}), 'works': defaultdict(<class 'int'>, {'A': 1993, 'D': 27213, 'EFG': 87, 'non classified': 9113})}
processed works count 38406
processed sources count 1594
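As a quick sanity check on these numbers, the per-category counts can be summed and compared with the reported processed counts, then turned into percentages. This is only an illustrative sketch: the counts are copied from the output above, while the helper `category_shares` is a hypothetical name and not part of the analytics code.

```python
from collections import defaultdict

# Counts copied from the output above.
results = {
    "sources": defaultdict(int, {"A": 995, "non classified": 599}),
    "works": defaultdict(
        int, {"A": 1993, "D": 27213, "EFG": 87, "non classified": 9113}
    ),
}
reported_totals = {"works": 38406, "sources": 1594}


def category_shares(counts):
    """Return {category: percentage of the total}, rounded to two decimals."""
    total = sum(counts.values())
    return {cat: round(100 * n / total, 2) for cat, n in counts.items()}


for entity, counts in results.items():
    total = sum(counts.values())
    # The per-category counts should add up to the reported processed count.
    assert total == reported_totals[entity]
    print(entity, total, category_shares(counts))
```

Both totals match the reported counts. Category D accounts for roughly 71% of the processed works.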