
Java: Read All Files in a Directory, Merge Files, Identify the Data Structure?



PMID-1334619.txt contains the text below. How am I supposed to read it word by word and store the words in a data structure?

Should I read it line by line, or word by word?

 

Hydroxylated kininogens and kinins.

Hydroxyprolyl-3-bradykinin was identified in the digest of purified human high molecular weight (H) kininogen with plasma kallikrein. Hydroxyproline was not detected in the heavy and light chains portions of H kininogen, although they include three possible sites for hydroxylation of proline by proline hydroxylase. The content of hydroxyprolyl-3-bradykinin in H kininogen from individual plasmas varied from 14% to 64% of total kinin. The present results and our previous results indicate that only kinin moity in H kininogen from human and monkey plasmas has been partially hydroxylated post-translationally by proline-4-hydroxylase.
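For the word-by-word question, here is a minimal sketch of one possible approach, not a definitive one. It assumes the whole file is read into a single string (so the offsets count every character, including line breaks) and that a "word" is any run of letters or digits. The class names `WordSpan` and `WordTokenizer` are made up for illustration. End offsets are exclusive, which appears to match the annotation lines in the .a1 file.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical holder: one word plus its character offsets in the text. */
final class WordSpan {
    final String word;
    final int start; // inclusive offset of the first character
    final int end;   // exclusive offset, one past the last character

    WordSpan(String word, int start, int end) {
        this.word = word;
        this.start = start;
        this.end = end;
    }
}

/** Hypothetical helper that splits text into words and records their offsets. */
final class WordTokenizer {
    // Assumption: a word is a run of Unicode letters or digits.
    // Punctuation and hyphens act as separators.
    private static final Pattern WORD = Pattern.compile("[\\p{L}\\p{N}]+");

    static List<WordSpan> tokenize(String text) {
        List<WordSpan> spans = new ArrayList<>();
        Matcher m = WORD.matcher(text);
        while (m.find()) {
            spans.add(new WordSpan(m.group(), m.start(), m.end()));
        }
        return spans;
    }

    /** Reads the whole file as UTF-8 so offsets are relative to the full text. */
    static List<WordSpan> tokenizeFile(Path file) throws IOException {
        String text = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        return tokenize(text);
    }
}
```

Calling `WordTokenizer.tokenize("Hydroxylated kininogens and kinins.")` yields four words, and the second one is "kininogens" with start 13 and end 23. Because hyphens are separators here, "Hydroxyprolyl-3-bradykinin" comes out as three tokens; change the pattern if you want it kept as one word.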

 

 

How do I merge the .a1 and .a2 files together?

How do I store them in data structures? For example: T1 (ID), Protein, start offset, end offset, event.

 

PMID-1334619.a1

 

T1 Protein 13 23 kininogens

T2 Protein 28 34 kinins

T3 Protein 53 63 bradykinin

T4 Protein 111 146 high molecular weight (H) kininogen

T5 Protein 152 169 plasma kallikrein

T6 Protein 245 256 H kininogen

T7 Protein 385 395 bradykinin

T8 Protein 399 410 H kininogen

T9 Protein 467 472 kinin

T10 Protein 538 543 kinin

T11 Protein 553 564 H kininogen
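One way to hold lines like these is a small class with one field per column. The sketch below is only an illustration under stated assumptions: the columns are separated by whitespace (spaces or tabs), the first four columns never contain whitespace, and everything after the fourth column is the covered text. `TextBound` and `TextBoundParser` are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical holder for one line such as "T1 Protein 13 23 kininogens". */
final class TextBound {
    final String id;    // e.g. "T1"
    final String type;  // e.g. "Protein"
    final int start;    // start offset in the .txt file
    final int end;      // end offset in the .txt file
    final String text;  // covered text, may contain spaces

    TextBound(String id, String type, int start, int end, String text) {
        this.id = id;
        this.type = type;
        this.start = start;
        this.end = end;
        this.text = text;
    }
}

final class TextBoundParser {
    /** Parses a single "T" line; the limit of 5 keeps spaces inside the text. */
    static TextBound parseLine(String line) {
        String[] parts = line.trim().split("\\s+", 5);
        if (parts.length < 5) {
            throw new IllegalArgumentException("Not a text-bound line: " + line);
        }
        return new TextBound(parts[0], parts[1],
                Integer.parseInt(parts[2]), Integer.parseInt(parts[3]), parts[4]);
    }

    /** Parses all "T" lines, keyed by id in file order; other lines are skipped. */
    static Map<String, TextBound> parseAll(List<String> lines) {
        Map<String, TextBound> byId = new LinkedHashMap<>();
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty() || !line.startsWith("T")) {
                continue;
            }
            TextBound tb = parseLine(line);
            byId.put(tb.id, tb);
        }
        return byId;
    }
}
```

Keying the map by id means a later reference such as `Theme:T1` can be looked up directly with `byId.get("T1")`.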

 

 

It may also link to other files.

 

PMID-1334619.a2

 

T12 Hydroxylation 0 12 Hydroxylated

T13 Hydroxylation 614 626 hydroxylated

E1 Hydroxylation:T12 Theme:T1

E2 Hydroxylation:T12 Theme:T2

E3 Hydroxylation:T13 Theme:T10
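Since the "E" lines in the .a2 file refer to "T" ids that may live in the .a1 file, one option for merging is to load both files into the same container and resolve the references afterwards. The following is a rough sketch, assuming well-formed lines where an event looks like `ID Type:TriggerId Role:RefId ...`. The names `EventAnnotation` and `MergedAnnotations` are invented for this example, and the "T" lines are kept as raw strings to keep it short.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical holder for one line such as "E1 Hydroxylation:T12 Theme:T1". */
final class EventAnnotation {
    final String id;        // e.g. "E1"
    final String type;      // e.g. "Hydroxylation"
    final String triggerId; // e.g. "T12"
    final Map<String, String> arguments = new LinkedHashMap<>(); // role -> referenced id

    EventAnnotation(String id, String type, String triggerId) {
        this.id = id;
        this.type = type;
        this.triggerId = triggerId;
    }
}

/** Hypothetical container for the combined contents of the .a1 and .a2 files. */
final class MergedAnnotations {
    final Map<String, String> entityLines = new LinkedHashMap<>();    // "T1" -> full line
    final Map<String, EventAnnotation> events = new LinkedHashMap<>(); // "E1" -> event

    /** Call once per file; lines from both files end up in the same maps. */
    void addLines(List<String> lines) {
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty()) {
                continue;
            }
            String[] parts = line.split("\\s+");
            if (line.startsWith("T")) {
                entityLines.put(parts[0], line);
            } else if (line.startsWith("E") && parts.length >= 2) {
                // Assumption: the second column is always "Type:TriggerId".
                String[] typeAndTrigger = parts[1].split(":", 2);
                EventAnnotation event =
                        new EventAnnotation(parts[0], typeAndTrigger[0], typeAndTrigger[1]);
                for (int i = 2; i < parts.length; i++) {
                    String[] roleAndRef = parts[i].split(":", 2);
                    event.arguments.put(roleAndRef[0], roleAndRef[1]);
                }
                events.put(event.id, event);
            }
        }
    }

    /** Returns the full "T" line that an event argument points at, or null. */
    String resolve(String eventId, String role) {
        EventAnnotation event = events.get(eventId);
        return event == null ? null : entityLines.get(event.arguments.get(role));
    }
}
```

After calling `addLines` with the lines of both files, `resolve("E3", "Theme")` returns the line for T10 from the .a1 file, even though E3 itself is defined in the .a2 file.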

 

https://sites.google.../site/bionlpst/

To put it more simply:

  1. Read all files from a folder in the same directory
  2. Match up files that have the same name but different extensions
  3. For the .txt files, count the number of words and the start and end offset of each word, and output the result to a text file
  4. Merge the .a1, .a2, .ann and .rel files (the other file extensions): read each line and store it in a data structure
  5. Join the results of 3 and 4 and output them into a new folder <previousfolder>_result

Which operations can I use to achieve this in Java?
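Steps 1, 2 and 5 of the list can be sketched with `java.nio.file`. This is only one possible layout, assuming every file of interest has an extension and that the result folder should sit next to the input folder. `FileGrouper` and its method names are hypothetical.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.TreeMap;

final class FileGrouper {
    /**
     * Groups the regular files in dir by base name, for example
     * "PMID-1334619" -> {"a1" -> path, "a2" -> path, "txt" -> path}.
     */
    static Map<String, Map<String, Path>> groupByBaseName(Path dir) throws IOException {
        Map<String, Map<String, Path>> groups = new TreeMap<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path file : stream) {
                if (!Files.isRegularFile(file)) {
                    continue; // skip sub-folders
                }
                String name = file.getFileName().toString();
                int dot = name.lastIndexOf('.');
                if (dot <= 0) {
                    continue; // skip files without an extension
                }
                String base = name.substring(0, dot);
                String ext = name.substring(dot + 1);
                groups.computeIfAbsent(base, k -> new TreeMap<>()).put(ext, file);
            }
        }
        return groups;
    }

    /** Creates (if needed) a sibling folder named "<dir>_result" and returns it. */
    static Path resultDir(Path dir) throws IOException {
        // Assumption: dir is not a file system root, so it has a name.
        Path abs = dir.toAbsolutePath().normalize();
        Path out = abs.resolveSibling(abs.getFileName() + "_result");
        return Files.createDirectories(out);
    }
}
```

With the files grouped this way, each base name gives you the .txt path for the word offsets and the annotation paths to merge, and the joined output can be written with `Files.write` into the folder returned by `resultDir`.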

 

 


System.out.TheStuffIMentionedAndJustDoIT('/path/to/directory/read/', 'outputfile.txt');

 

Also, I did some digging and found ALL your answers here.

Edited by ignace