Click on the Upload Translation Memory button: upload the TMX, accept the conditions and click on the term extraction button.
Optionally, a dictionary can be uploaded together with the translation memory so that the extracted term pairs can be matched against it. The dictionary has to be a two-column text file in UTF-8 character encoding. The first column contains the entries in the source language and the second column the entries in the target language; the two columns are separated by a tab character.
Click on the Select your terms button and choose the extraction you want to work on from among the extractions that are displayed.
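Before uploading, the dictionary file can be checked locally. The following is a minimal sketch in Python, assuming only the format described above (UTF-8, two tab-separated columns); the function name `check_dictionary` and the decision to ignore blank lines are illustrative assumptions, not part of the tool.

```python
from pathlib import Path


def check_dictionary(path):
    """Return (line_number, problem) tuples for a two-column dictionary file."""
    problems = []
    # Reading as UTF-8 raises UnicodeDecodeError if the file uses another encoding.
    text = Path(path).read_text(encoding="utf-8")
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # assumption: blank lines are harmless; the tool may be stricter
        columns = line.split("\t")
        if len(columns) != 2:
            problems.append((number, f"expected 2 columns, found {len(columns)}"))
        elif not columns[0].strip() or not columns[1].strip():
            problems.append((number, "empty source or target entry"))
    return problems
```

An empty list means every non-blank line has exactly one source entry and one target entry separated by a tab.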
Once you have selected the extraction, you can begin to work on the term pair candidates extracted from the translation memory. The environment known as Itzulterm will be used for carrying out this work.
What is displayed by Itzulterm
The source-language and target-language pairs regarded as equivalent terms are displayed in a single table, ordered by how reliable ('secure') the selection algorithm considers their equivalence to be. The frequency of each term pair is also displayed (column F).
To display the results, click on the See results button. When doing so, the user is presented with various options:
- To specify the minimum frequency that the term pairs have in the translation memory (Min. Frequency): 0, 1, 2, 5, 10, 15...
- To specify the number of candidates (Num. Pairs): 100, 1,000, 2,000...
- Whether or not to have the terms found in the dictionaries displayed in a table.
- User dictionary: together with the TMX memory, the user can send a dictionary that is already available or one made up of words exported from previous extractions. Once the extraction has been completed, the user can choose whether or not to display in the table the term pairs that are already in that dictionary and have been extracted again (depending on the type of processing the user wants to do).
- Itzulterm dictionary: this option needs to be activated if the user wants to know whether the terms extracted by ElexBI are in the general Elhuyar database. This information may be of interest when a term pair is regarded as valid.
- The pairs found in the dictionaries are ordered at the start of the ranking, and the dictionaries where they can be found are specified in the DICT. column by means of icons.
- All valid initially: to regard as valid all the term pairs proposed in the results table of the automatic extraction; the next step is to reject the inappropriate ones. Marking the inappropriate pairs rather than the appropriate ones very often involves less work.
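The effect of these options on the results table can be pictured with a short Python sketch. It is only an illustration under stated assumptions: the `Candidate` fields, the numeric `score`, and the exact ordering rule are hypothetical, since the selection algorithm itself is not described here.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    source: str
    target: str
    frequency: int       # shown in column F
    score: float         # hypothetical reliability score from the algorithm
    in_dictionary: bool  # whether the pair was found in a dictionary (DICT. column)


def select_candidates(candidates, min_frequency=0, num_pairs=100,
                      show_dictionary_terms=True):
    """Apply Min. Frequency, the dictionary display option and Num. Pairs."""
    kept = [c for c in candidates if c.frequency >= min_frequency]
    if not show_dictionary_terms:
        kept = [c for c in kept if not c.in_dictionary]
    # Dictionary matches come first, then pairs by descending score.
    kept.sort(key=lambda c: (not c.in_dictionary, -c.score))
    return kept[:num_pairs]
```

With 'All valid initially', every pair returned this way would start out marked as valid, leaving only the rejections to be done by hand.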
How to use Itzulterm
By clicking on certain buttons in this table, the user has various options for working on the results and making use of them.
- TEXT column: clicking on the button in this column displays the contexts of the corresponding candidate pair.
- EVAL. column: to validate or reject the corresponding term pair. If a term pair is validated, the terms appear highlighted in green; when rejected, they appear in red.
- The button beside each term: displays the ranking of the equivalents of the term candidate; the system also provides the option of validating a candidate that is not the first one in the ranking (as in the EVAL. column).
- There are other buttons in the context window in the units of each language in the translation memory segments. Their function is to improve the automatic extraction, i.e., to extract a term that has not been fully extracted, or to extract a term that may be present within an extracted candidate, etc. This is how it is used:
- button: the option to change a term extracted from that unit. All you need to do is select the desired syntagma, write the lemma in the 'Canonical form' box and click on the save button.
- button: from the unit of the other language in the same segment, click on the button and the new term extracted in the previous step will appear at the end of the list; click on the green button to validate the pair.
Once the term pairs the user wishes to include in the dictionary have been validated, various actions can be performed:
- A CSV file can be created: when clicking on the ‘Export results’ button, ItzulTerm exports all the term pairs that have been validated and their frequencies to a .csv file. This file may be useful for subsequent extraction purposes and can be uploaded as a dictionary together with a translation memory.
- A new dictionary can be created. Click on the 'Make new dictionary' button inside the 'Action' icon
- An already existing dictionary can be updated/fed: Click on the 'Update dictionary' button inside the 'Action' icon
To create a new dictionary, the steps appearing on screen need to be followed.
Step one: Describe the dictionary.
The dictionary has to be given a name; the system will generate a code from that name.
A description of the dictionary can be made here.
Click on the 'Action' -> 'Continue' button
Step two: Specify the dictionary’s domains.
In this step, the domains that you wish to assign to the dictionary concepts will be selected. The system gives two options for this:
- By selecting from among the default domains: to do this, click on the domains in the domains list provided
- By adding a concept tree that specifies your own domains: to do this, a domain list in TXT format needs to be uploaded
Click on the 'Action' -> 'Continue' button
Step three: Describe the source that has been used for the dictionary.
In this step, information on the translation memory that has been uploaded will need to be provided. So it needs to be given a name, and a domain that describes the contents of the translation memory needs to be selected.
Click on the 'Action' -> 'Create dictionary' button
After validating the term pairs, click on the 'Action' -> 'Update dictionary' button, and follow the steps indicated on screen.
The first thing to do is to select the dictionary you want to feed (Select dictionary).
Then a description of the translation memory used needs to be provided: it needs to be given a name and have a domain assigned to it.
Click on the 'Action' -> 'Update dictionary' button
Edit/Publish your dictionary
The name of the terminology database is SareTerm.
From here there will be an option to access all the dictionary contents.
But apart from the contents there will be an option to manage all the dictionary information.
To do this, go to the Views section and select what you want to see or manage: concepts, images, domains, contexts, your dictionaries, sources, who has been using them, the proposals and comments, etc. received.
Click on views and all the lists will be displayed so that they can be managed. There will be an option of carrying out various actions using these lists.
From here it will also be possible to run various advanced searches.
There are two ways of importing terminology into the database:
- 1. By importing terms extracted from a translation memory worked on previously
- 2. By importing a CSV file
To do this, click on the Imports button and then on the option you want to use:
- 1. Load importation: using the database to work on the terminology extracted from a translation memory
- 2. Import from file: to create a database from your CSV file
Using the database to work on the terminology extracted from the translation memories
Click on the Imports -> Load importation button
From the table that is displayed, select the import you want in order to start working on the database.
To create a terminology database from a CSV file:
Click on the Imports -> Import from file button
You will need to select a CSV file and also select how the two language columns are to be separated.
The CSV file can include different kinds of information: terms in various languages, definitions, domains. However, every column must always have a title describing the information it contains.
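As an illustration of a CSV file with a title for every column, here is a minimal Python sketch. The column titles in `FIELDNAMES` and the semicolon separator are assumptions made for the example; the titles the import screen actually expects are not specified here, and the separator is whatever is selected at import time.

```python
import csv
import io

# Hypothetical column titles, chosen only for this example.
FIELDNAMES = ["eu", "es", "definition", "domain"]


def build_csv(rows, delimiter=";"):
    """Return CSV text whose first line holds the column titles."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES,
                            delimiter=delimiter, lineterminator="\n")
    writer.writeheader()    # the title row describing each column
    writer.writerows(rows)  # keys missing from a row are written as empty cells
    return buffer.getvalue()
```

For example, `build_csv([{"eu": "hiztegi", "es": "diccionario", "domain": "lexicography"}])` produces a title row followed by one data row with an empty definition cell.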
Two types of action can be performed when importing a CSV file: either to create a new dictionary or to update an existing one.
Depending on the action selected, the steps successively indicated by the interface will have to be followed. The steps for these action types will be the same as those explained in help sections 4 and 5.
When the importation (of the translation memory or CSV file) has been done, the terminology database will automatically display the list of concepts included in the dictionary.
You have information about the concepts in the columns:
- You will have a letter in the first column of each concept:
- N = new concept
- S = possible case of synonymy
- D = the same concept has been found (same equivalences in the two languages), but with a different domain. It may be a case of synonymy
- U = when feeding a previously existing dictionary, when a new piece of information is added to the same concept (e.g. new contexts)
- Concept id: The concept identifier is a number that is automatically applied
- Term in Basque (eu)
- Term in Spanish (es)
- Term in English (en)
- Term in French (fr)
- La: term in Latin
- Sy: symbol
- Dictionary: the code indicating the dictionary where the concept can be found
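When reviewing an import, it can help to tally the letters shown in the first column. The sketch below is a hypothetical helper in Python: the letter meanings are taken from the list above, while the function `summarise` and its input (a plain list of letters copied from the table) are assumptions for illustration.

```python
from collections import Counter

# Meanings of the first-column letters, as listed above.
STATUS_LETTERS = {
    "N": "new concept",
    "S": "possible case of synonymy",
    "D": "same concept found with a different domain",
    "U": "existing concept updated with new information",
}


def summarise(letters):
    """Count how many concepts carry each status letter."""
    counts = Counter(letters)
    unknown = sorted(set(counts) - set(STATUS_LETTERS))
    if unknown:
        raise ValueError(f"unknown status letters: {unknown}")
    return {STATUS_LETTERS[k]: counts[k] for k in STATUS_LETTERS if counts[k]}
```

A large number of S or D entries suggests that some concepts may need to be combined before publishing.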
Actions that can be carried out:
- Create a new concept (New)
- Delete an existing concept (Delete)
- Create a new selection (Create new selection): there is an option to create and save a new list with the desired concepts in order to do further work on them
- Show subset (Show subset): to display only the desired concepts on screen
- Apply formula (Apply formula): to perform various actions simultaneously on the concepts
- Combine concepts (Combine concepts): to unify concepts in cases of synonymy
- Duplicate concepts (Duplicate concept): to create an identical copy of an existing concept
- Update dictionary (Update dictionary): to publish the selected concepts directly in a browsing application
Description of the concept entry:
This dictionary is a terminology one and therefore the basic unit is the concept. All the information about a concept is represented in a terminology entry: domain(s), terms used to express this concept, contexts, definitions, articles and images.
Depending on the domain, a term may express various concepts (polysemy). In the dictionary the terms are organised by domain. Likewise, two terms may express the same concept (synonymy). In this dictionary, synonymous terms are saved in a single concept entry; the different meanings of a polysemous term, by contrast, are each saved in a separate concept entry.
Each concept may have the following information:
- Id. The concept is automatically given a code. It cannot be edited.
- Dictionary: The dictionary to which the concept belongs. It cannot be edited.
- Domain: The domain to which the concept belongs. A concept may be linked to more than one domain.
- Status: The status that the concept finds itself in (done or being processed).
- Terms: The terms are published in two languages: Basque (eu) and Spanish (es), but more languages can be incorporated.
- Definition: To write the definition of the concept.
- Source: Definition source.
- Relations: A concept can be related to other concepts (hypernymy, hyponymy, antonymy). The related term will appear as a link in the application browser.
- Information: Additional information about the concept can be included.
- Images: The concept can be accompanied by images.
- Articles: The concept can be accompanied by an article.
- Contexts: Context pairs linked to the terms appear in Basque and Spanish.
When editing the dictionary, don’t forget that work is being done on a specific concept entry.
How to edit a concept entry will be explained in the sections below:
- 1. Specify domain: one or more domains can be entered. When writing in the field box, the system will offer the list of domains that include what has been previously written.
- 2. Term: the lemma of the term needs to be entered. Here a previously created term can be edited by clicking on the term you wish to change. Or a new term can be created. To do this, click on the right-hand ‘action’ button and then click on the one that says ‘create new’.
More things can be done among the actions: select all, deselect all, delete, retrieve what has been deleted, etc.
The form that opens up needs to be filled in:
- Term: lemma of the term.
- Language: Basque or Spanish. The corresponding language code has to be selected in this field (eu or es).
- Category: noun, verb or adjective.
- Term type: whether the term is in the abbreviated or full form.
- Source: term source, where it has come from.
Click on the save button.
- 3. Images: A new image of the concept can be added. There are two options for this: click on the right-hand ‘action’ button and click on the one that says ‘create new’ or on the one that says ‘examine Wikipedia’.
If it is a new one, an archive image will be inserted, as long as the required conditions are met.
In addition, SareTerm provides the option of automatically incorporating images from Wikipedia (as long as the conditions for this purpose are complied with).
More actions can be performed: select all, deselect all, delete, retrieve what has been deleted, etc.
To upload an image, click on the camera icon and select the desired file. Below the icon there are boxes for providing additional image data.
- Licence type: Creative Commons Attribution-ShareAlike 4.0 International (Aitortu-PartekatuBerdin 4.0 Nazioartekoa) or public domain (Jabetza publikoa). For the purpose of publishing images in the dictionary, the most appropriate licence is Creative Commons Attribution-ShareAlike 4.0 International or public domain, so it is advisable to add images that have licences of this type. Images with more restrictive licences can also be uploaded, but in that case it is the responsibility of the person uploading to provide information certifying that the image can be used in the dictionary, so that the dictionary managers can verify that the image meets the conditions. The information on the licence or the transfer of use will be published together with the image.
- Licence version: The version of the licence will have to be stated.
- Original URL: The web address of the original image. The Wikimedia Commons address, or the web address that can be used to verify the acknowledged licence.
- Owner: If the image licence has an 'Attribution' component, this field must be completed.
- Description: A description of the image or other additional information can be added.
- Caption: The caption that appears in the dictionary will have to be written.
- 4. Article: An article can be added to the concept. To do this, click on the right-hand ‘action’ button and then click on the one that says ‘create new’. More actions can be performed: select all, deselect all, delete, retrieve what has been deleted, etc.
To create an article, the form will need to be filled in:
- Title: title of the article
- Contents: this is where the article is written; tools to help with writing are provided in the section above it
- Licence: the licence type needed to publish the article will have to be specified
- Domain: select the domain to which the article belongs
- Status: to indicate the status of the article (done or being processed)
- Author: the author of the article will have to be stated
- 5. Contexts: There is an option to make changes to a context pair that has been created (by clicking on context) or to add a new context. To do this, click on the right-hand ‘action’ button and then click on the one that says ‘create new’.
To enter contexts manually, the boxes for providing additional data will need to be filled in.
- Language: the language of the context being entered. The corresponding language code has to be selected in this field (eu, es, en, fr, ...).
- Source text / Target text: the context provided for the term has to be written or copied here.
- Source form / Target form: write the form in which the term of the concept entry appears in the context just entered. After that, the 'calculate position' button has to be clicked on.
Other actions can also be performed on the contexts.
Let us assume that a new term, a synonym, has been added to the concept entry.
When a new term is added to the concept entry, the system automatically searches for the contexts that contain that term and loads them into the concept entry. These contexts appear in red in the concept entry and are referred to as incomplete contexts.
To complete the concept entry, the corresponding equivalent will have to be provided. That way, the context will become a completed context.
There can be as many of these synonymy cases as one wants.
To make any changes to the concept entry, the action button in the top right-hand corner has to be clicked on and the changes saved.
This action button can also be used to create a new concept entry from scratch or to delete an existing one.
Once all the concepts have been worked on and are considered ready to be included in the dictionary, they have to be published. To do this, go to the screen displaying all the concept entries and select all of them (or just the ones you want included in the dictionary, by ticking the box on the left). Then click on the action button and, depending on the work done, choose one of the following:
- Publish the dictionary: a new dictionary will be created
- Feed the dictionary: the concepts worked on will be added to an already existing dictionary
Once the dictionary has been published, it can be browsed over the internet. To do this, a dedicated browsing application will be designed.
Searches in the browsing application can be made according to various criteria: language, domain and dictionary.
There will also be an option to run searches directly in other dictionaries. To do this, click on the little book and from there on the desired dictionary.
If you want to change or correct the content of a concept, the system offers two options for including them in the dictionary:
- You complete the information yourself: to do this, click on the pencil on the right. You will be taken directly to the concept entry (SareTerm), where the desired changes or updates can be made.
- Send a comment: this option sends a comment about a concept to the dictionary administrator. To do this, click on the icon and fill in the form that is displayed.
If the consulted term is not in the dictionary, there will be an option to have it included. The system offers various options for doing this:
- 1. If you click on the option Add a term by using the contexts you yourself can add the concept to the dictionary by going to SareTerm. To do this, the system will display the context in which the searched term appears and from there you will have to complete a concept entry: search for the eu equivalent in the contexts, create an eu term, domain, dictionary and other information.
- 2. If you click on You can add a term to the dictionary, you will be taken directly to SareTerm and you will have to create the concept entry. But in this case, the whole entry will be empty and you will have to fill in everything.
- 3. If you select the Send Proposal option, the administrator will be asked to add a new concept to the dictionary. You will be asked to complete a form for this purpose.
As soon as you become an administrator, you will be able to make changes to the dictionary configuration. The aim of this configuration is to enable each user to adapt TermKate to his/her needs.
To do this, click on the spanner icon in the bubbles at the top.
When you click on the icon, various configuration options will open up: