tesseract
3.05.02
#include "oldlist.h"
#include "efio.h"
#include "emalloc.h"
#include "featdefs.h"
#include "tessopt.h"
#include "ocrfeatures.h"
#include "clusttool.h"
#include "cluster.h"
#include <string.h>
#include <stdio.h>
#include <math.h>
#include "unichar.h"
#include "commontraining.h"
Macros
#define PROGRAM_FEATURE_TYPE "cn"
Functions
DECLARE_STRING_PARAM_FLAG (D)
int main (int argc, char **argv)
void WriteNormProtos (const char *Directory, LIST LabeledProtoList, const FEATURE_DESC_STRUCT *feature_desc)
void WriteProtos (FILE *File, uinT16 N, LIST ProtoList, BOOL8 WriteSigProtos, BOOL8 WriteInsigProtos)
int main (int argc, char *argv[])
Variables
CLUSTERCONFIG CNConfig
#define PROGRAM_FEATURE_TYPE "cn"
Definition at line 40 of file cntraining.cpp.
DECLARE_STRING_PARAM_FLAG (D)
int main (int argc, char **argv)
This program reads in a text file consisting of feature samples from a training page in the following format:
FontName UTF8-char-str xmin ymin xmax ymax page-number
 NumberOfFeatureTypes(N)
   FeatureTypeName1 NumberOfFeatures(M)
      Feature1
      ...
      FeatureM
   FeatureTypeName2 NumberOfFeatures(M)
      Feature1
      ...
      FeatureM
   ...
   FeatureTypeNameN NumberOfFeatures(M)
      Feature1
      ...
      FeatureM
FontName CharName ...
The result of this program is a binary inttemp file used by the OCR engine.
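The record layout described above can be read with ordinary stream extraction. The following is a minimal sketch, not the code used by cntraining.cpp: `SampleRecord`, `FeatureSet`, and `ReadSampleRecord` are hypothetical names introduced for illustration, and the sketch assumes that each feature occupies exactly one line and that feature values can be kept as raw text.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical types for illustration only; cntraining.cpp uses its own
// structures for samples and features.
struct FeatureSet {
  std::string type_name;              // FeatureTypeName
  std::vector<std::string> features;  // M raw feature lines, left unparsed
};

struct SampleRecord {
  std::string font_name;
  std::string utf8_char;
  int xmin = 0, ymin = 0, xmax = 0, ymax = 0;
  int page_number = 0;
  std::vector<FeatureSet> feature_sets;  // N entries
};

// Reads one record. Returns false on end of input or malformed data.
// Assumption: each feature occupies exactly one line of text.
bool ReadSampleRecord(std::istream& in, SampleRecord* record) {
  SampleRecord result;
  if (!(in >> result.font_name >> result.utf8_char >> result.xmin >>
        result.ymin >> result.xmax >> result.ymax >> result.page_number)) {
    return false;
  }
  int num_types = 0;  // NumberOfFeatureTypes(N)
  if (!(in >> num_types) || num_types < 0) return false;
  for (int t = 0; t < num_types; ++t) {
    FeatureSet set;
    int num_features = 0;  // NumberOfFeatures(M)
    if (!(in >> set.type_name >> num_features) || num_features < 0) {
      return false;
    }
    std::string line;
    std::getline(in, line);  // discard the rest of the "name count" line
    for (int f = 0; f < num_features; ++f) {
      if (!std::getline(in, line)) return false;
      set.features.push_back(line);
    }
    result.feature_sets.push_back(set);
  }
  *record = result;
  return true;
}
```

Calling `ReadSampleRecord` in a loop until it returns false would walk through every record in a file. Interpreting the individual feature values depends on the feature type and is left out here.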
argc    number of command line arguments
argv    array of command line arguments
Definition at line 388 of file tesseractmain.cpp.
int main (int argc, char *argv[])
This program reads in a text file consisting of feature samples from a training page in the following format:
FontName CharName NumberOfFeatureTypes(N)
   FeatureTypeName1 NumberOfFeatures(M)
      Feature1
      ...
      FeatureM
   FeatureTypeName2 NumberOfFeatures(M)
      Feature1
      ...
      FeatureM
   ...
   FeatureTypeNameN NumberOfFeatures(M)
      Feature1
      ...
      FeatureM
FontName CharName ...
It then appends these samples into a separate file for each character. The name of the file is
DirectoryName/FontName/CharName.FeatureTypeName
The DirectoryName can be specified via a command line argument. If not specified, it defaults to the current directory. The format of the resulting files is:
NumberOfFeatures(M)
   Feature1
   ...
   FeatureM
NumberOfFeatures(M)
...
Each output file has a header that describes the type of feature the file contains. This header is in the format required by the clusterer. A command line argument can also be used to specify that only the first N samples of each class should be used.
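The naming scheme DirectoryName/FontName/CharName.FeatureTypeName can be illustrated with a short helper. This is a sketch under stated assumptions rather than the routine cntraining.cpp uses: `SampleFileName` is a hypothetical function, and an empty directory string stands in for the "current directory" default, which yields a relative path.

```cpp
#include <string>

// Hypothetical helper, not part of cntraining.cpp: builds
// DirectoryName/FontName/CharName.FeatureTypeName.
// Assumption: an empty directory means "current directory", so the
// result is a path relative to the working directory.
std::string SampleFileName(const std::string& directory,
                           const std::string& font_name,
                           const std::string& char_name,
                           const std::string& feature_type) {
  std::string path;
  if (!directory.empty()) {
    path += directory;
    path += '/';
  }
  path += font_name;
  path += '/';
  path += char_name;
  path += '.';
  path += feature_type;
  return path;
}
```

With a made-up font name `Arial`, character `a`, and the feature type `cn` from PROGRAM_FEATURE_TYPE, `SampleFileName("out", "Arial", "a", "cn")` gives `out/Arial/a.cn`, and passing an empty directory gives `Arial/a.cn`.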
argc    number of command line arguments
argv    array of command line arguments
Definition at line 133 of file cntraining.cpp.
void WriteNormProtos (const char *Directory, LIST LabeledProtoList, const FEATURE_DESC_STRUCT *feature_desc)
This routine writes the specified samples into files which are organized according to the font name and character name of the samples.
Directory           directory to place sample files into
LabeledProtoList    list of labeled protos
feature_desc        description of the features
Definition at line 224 of file cntraining.cpp.
void WriteProtos (FILE *File, uinT16 N, LIST ProtoList, BOOL8 WriteSigProtos, BOOL8 WriteInsigProtos)
Definition at line 263 of file cntraining.cpp.
CLUSTERCONFIG CNConfig
Definition at line 76 of file cntraining.cpp.