%%output=lang_def == Definition :<> === Objects === @[predefinedtypes|] ==== Atoms and Sequences All data **objects** in Euphoria are either **atoms** or **sequences**. An **atom** is a single numeric value. A **sequence** is a collection of objects, either atoms or sequences themselves. A sequence can contain any mixture of atom and sequences; a sequence does not have to contain all the same data type. Because the **objects** contained in a sequence can be an arbitrary mix of atoms or sequences, it is an extremely versatile data structure, capable of representing any sort of data. A sequence is represented by a list of objects in brace brackets **{ }**, separated by commas with an optional sequence terminator, ##$##. Atoms can have any integer or double-precision floating point value. They can range from approximately -1e300 (minus one times 10 to the power 300) to +1e300 with 15 decimal digits of accuracy. Here are some Euphoria objects: -- examples of atoms: 0 1000 98.6 -1e6 23_100_000 x $ -- examples of sequences: {2, 3, 5, 7, 11, 13, 17, 19} {1, 2, {3, 3, 3}, 4, {5, {6}}} {{"jon", "smith"}, 52389, 97.25} {} -- the 0-element sequence By default, number literals use //base 10//, but you can have integer literals written in other bases, namely binary //(base 2)//, octal //(base 8)//, and hexadecimal //(base 16)//. To do this, the number is prefixed by a 2-character code that lets Euphoria know which base to use. |= Code |= Base | | 0b | 2 = **B**inary | | 0t | 8 = Oc**t**al | | 0d | 10 = **D**ecimal | | 0x | 16 = He**x**adecimal | For example: 0b101 --> decimal 5 0t101 --> decimal 65 0d101 --> decimal 101 0x101 --> decimal 257 Additionally, hexadecimal integers can also be written by prefixing the number with the '#' character. For example: #FE -- 254 #A000 -- 40960 #FFFF00008 -- 68718428168 -#10 -- -16 Only digits and the letters A, B, C, D, E, F, in either uppercase or lowercase, are allowed in hexadecimal numbers. 
Hexadecimal numbers are always positive, unless you add a minus sign in front of the # character. So for instance #FFFFFFFF is a huge positive number (4294967295), **not** ##-1##, as some machine-language programmers might expect.

Numeric literals, especially large ones, can be easier to read when they contain embedded grouping characters. We are used to commas (periods in Europe) grouping large numbers into three-digit subgroups. In Euphoria the underscore character does the same job, and digits can be grouped in any way that is useful to us.

    atom big = 32_873_787      -- Set 'big' to the value 32873787
    atom salary = 56_110.66    -- Set salary to the value 56110.66
    integer defflags = #0323_F3CD
    object phone = 61_3_5536_7733
    integer bits = 0b11_00010_1

**Sequences** can be nested to any depth, i.e. you can have sequences within sequences within sequences and so on (until you run out of memory). Brace brackets are used to construct sequences out of a list of expressions. These expressions can be constant or evaluated at run-time. e.g.

    { x+6, 9, y*w+2, sin(0.5) }

All sequences can include a special //end of sequence// marker, the ##$## character. This makes it convenient to edit lists that may change often as development proceeds.

    sequence seq_1 = { 10, 20, 30, $ }
    sequence seq_2 = { 10, 20, 30 }
    equal(seq_1, seq_2) -- TRUE

The **"Hierarchical Objects"** part of the Euphoria acronym comes from the hierarchical nature of nested sequences. This should not be confused with the class hierarchies of certain object-oriented languages.

Why do we call them atoms? Why not just "numbers"? Well, an ##atom## //is// just a number, but we wanted to have a distinctive term that emphasizes that they are indivisible (that's what "atom" means in Greek). In the world of physics you can 'split' an atom into smaller parts, but you no longer have an atom~--only various particles.
You can 'split' a number into smaller parts, but you no longer have a number~--only various digits. Atoms are the basic building blocks of all the data that a Euphoria program can manipulate. With this analogy, **sequence**s might be thought of as "molecules", made from atoms and other molecules. A better analogy would be that sequences are like directories, and atoms are like files. Just as a directory on your computer can contain both files and other directories, a sequence can contain both atoms and other sequences (and //those// sequences can contain atoms and sequences and so on). {{{ . object . / \ . / \ . atom sequence }}} As you will soon discover, sequences make Euphoria very simple //and// very powerful. **Understanding atoms and sequences is the key to understanding Euphoria.** ;**Performance Note~:** :Does this mean that all atoms are stored in memory as eight-byte floating-point numbers? No. The Euphoria interpreter usually stores integer-valued atoms as machine integers (four bytes) to save space and improve execution speed. When fractional results occur or integers get too big, conversion to IEEE eight-byte floating-point format happens automatically. ==== Character Strings and Individual Characters A **character string** is just a ##sequence## of characters. It may be entered in a number of ways ... * Using double-quotes e.g. "ABCDEFG" * Using raw string notation e.g. -- Using back-quotes `ABCDEFG` or -- Using three double-quotes """ABCDEFG""" * Using binary strings e.g. b"1001 00110110 0110_0111 1_0101_1010" -- ==> {#9,#36,#67,#15A} * Using hexadecimal byte strings e.g. 
x"65 66 67 AE" -- ==> {#65,#66,#67,#AE} When you put too many hex characters together they are split up appropriately for you: x"656667AE" -- 8-bit ==> {#65,#66,#67,#AE} **The rules for double-quote strings are:** # They begin and end with a double-quote character # They cannot contain a double-quote # They must be only on a single line # They cannot contain the TAB character # If they contain the back-slash '\' character, that character must immediately be followed by one of the special //escape// codes. The back-slash and escape code will be replaced by the appropriate single character equivalent. If you need to include double-quote, end-of-line, back-slash, or TAB characters inside a double-quoted string, you need to enter them in a special manner. e.g. "Bill said\n\t\"This is a back-slash \\ character\".\n" Which, when displayed should look like ... {{{ Bill said "This is a back-slash \ character". }}} **The rules for raw strings are:** # Enclose with three double-quotes {{{"""..."""}}} or back-quote. {{{`...`}}} # The resulting string will never have any carriage-return characters in it. # If the resulting string begins with a new-line, the initial new-line is removed and any trailing new-line is also removed. # A special form is used to automatically remove leading whitespace from the source code text. You might code this form to align the source text for ease of reading. If the first line after the raw string start token begins with one or more underscore characters, the number of consecutive underscores signifies the maximum number of whitespace characters that will be removed from each line of the raw string text. The underscores represent an assumed left margin width. **Note**, these leading underscores do not form part of the raw string text. e.g. -- No leading underscores and no leading whitespace ` Bill said "This is a back-slash \ character". ` Which, when displayed should look like ... {{{ Bill said "This is a back-slash \ character". 
}}}

-- No leading underscores, but with leading whitespace
` Bill said "This is a back-slash \ character". `

Which, when displayed should look like ...
{{{ Bill said "This is a back-slash \ character". }}}

-- Leading underscores and leading whitespace
` _____Bill said "This is a back-slash \ character". `

Which, when displayed should look like ...
{{{ Bill said "This is a back-slash \ character". }}}

Extended string literals are useful when the string contains new-lines, tabs, or back-slash characters because they do not have to be entered in the special manner. The back-quote form can be used when the string literal contains a set of three double-quote characters, and the triple quote form can be used when the text literal contains back-quote characters. If a literal contains both a back quote and a set of three double-quotes, you will need to concatenate two literals.

    object TQ, BQ, QQ
    TQ = `This text contains """ for some reason.`
    BQ = """This text contains a back quote ` for some reason."""
    QQ = """This text contains a back quote ` """ & `and """ for some reason.`

**The rules for binary strings are:**
# They begin with the pair ##b"## and end with a double-quote (##"##) character
# They can only contain binary digits (0-1), and space, underscore, tab, newline, carriage-return. Anything else is invalid.
# An underscore is simply ignored, as if it was never there. It is used to aid readability.
# Each set of contiguous binary digits represents a single sequence element
# They can span multiple lines
# The non-digits are treated as punctuation and used to delimit individual values.

    b"1 10 11_0100 01010110_01111000" == {0x01, 0x02, 0x34, 0x5678}

**The rules for hexadecimal strings are:**
# They begin with the pair ##x"## and end with a double-quote (##"##) character
# They can only contain hexadecimal digits (0-9 A-F a-f), and space, underscore, tab, newline, carriage-return. Anything else is invalid.
# An underscore is simply ignored, as if it was never there.
It is used to aid readability. # Each pair of contiguous hex digits represents a single sequence element with a value from 0 to 255 # They can span multiple lines # The non-digits are treated as punctuation and used to delimit individual values. x"1 2 34 5678_AbC" == {0x01, 0x02, 0x34, 0x56, 0x78, 0xAB, 0x0C} Character strings may be manipulated and operated upon just like any other sequences. For example the string we first looked at "ABCDEFG" is entirely equivalent to the sequence: {65, 66, 67, 68, 69, 70, 71} which contains the corresponding ASCII codes. The Euphoria compiler will immediately convert "ABCDEFG" to the above sequence of numbers. In a sense, there are no "strings" in Euphoria, only sequences of numbers. A quoted string is really just a convenient notation that saves you from having to type in all the ASCII codes. @[emptyseq|] It follows that "" is equivalent to {}. Both represent the sequence of zero length, also known as the **empty sequence**. As a matter of programming style, it is natural to use "" to suggest a zero length sequence of characters, and {} to suggest some other kind of sequence. An **individual character** is an **atom**. It must be entered using single quotes. There is a difference between an individual character (which is an atom), and a character string of length 1 (which is a sequence). e.g. 'B' -- equivalent to the atom 66 - the ASCII code for B "B" -- equivalent to the sequence {66} Again, ##'B'## is just a notation that is equivalent to typing ##66##. There are no "characters" in Euphoria, just numbers (atoms). However, it is possible to use characters without ever having to use their numerical representation. Keep in mind that an atom is //not// equivalent to a one-element sequence containing the same value, although there are a few built-in routines that choose to treat them similarly. 
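
The equivalences described above can be checked directly. The following is a minimal sketch, assuming Euphoria 4 syntax (declarations with initial values); the variable names are made up for illustration, and the values in the comments follow from the rules stated in this section.

{{{
sequence word = "ABC"
integer same = equal(word, {65, 66, 67})  -- same is 1: a string is a sequence of codes

atom ch = 'B'                             -- ch is 66
sequence str = "B"                        -- str is {66}
integer alike = equal(ch, str)            -- alike is 0: an atom is not a one-element sequence

-- Because characters are just numbers, arithmetic works on them.
sequence lowered = word + ('a' - 'A')     -- lowered is "abc", i.e. {97, 98, 99}
integer empty_len = length("")            -- empty_len is 0, the same as length({})
}}}

Note that ##equal## is used for the comparisons, since it always yields a single atom regardless of the shapes of its arguments.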
====Escaped Characters==== Special characters may be entered using a back-slash: |=Code | Meaning| | \n | newline | | \r | carriage return | | \t | tab | | {{{\\}}} | backslash | | \" | double quote | | \' | single quote | | \0 | null | | \e | escape | | \E | escape | | \b/d..d/ | A binary coded value, the \b is followed by 1 or more binary digits. \\ Inside strings, use the space character to delimit or end a binary value. | \x/hh/ | A 2-hex-digit value, e.g. "\x5F" ==> {95} | | \u/hhhh/ | A 4-hex-digit value, e.g. "\u2A7C" ==> {10876} | | \U/hhhhhhhh/ | An 8-hex-digit value, e.g. "\U8123FEDC" ==> {2166619868} | For example, ##"Hello, World!\n"##, or ##'~\~\'##. The demonstration editor ##edx## displays character strings in green. Note that you can use the underscore character ##'_'## inside the ##\b##, ##\x##, ##\u##, and ##\U## values to aid readability, e.g. ##"\U8123_FEDC" ==> {2166619868}## === Identifiers An identifier is just the name you give something in your program. This can be a variable, constant, function, procedure, parameter, or namespace. An identifier must begin with either a letter or an underscore, then followed by zero or more letters, digits or underscore characters. There is no theoretical limit to how large an identifier can be but in practice it should be no more than about 30 characters. Identifiers are **case-sensitive**. This means that ##"Name"## is a different identifier from ##"name"##, or ##"NAME"##, etc... Examples of valid identifiers~: n color26 ShellSort quick_sort a_very_long_indentifier_that_is_really_too_long_for_its_own_good _alpha Examples of invalid identifiers~: 0n -- must not start with a digit ^color26 -- must not start with a punctuation character Shell Sort -- Cannot have spaces in identifiers. quick-sort -- must only consist of letters, digits or underscore. @[source_comments|] === Comments Comments are ignored by Euphoria and have no effect on execution speed. For example the ##edx## editor displays comments in red. 
There are three forms of comment text:

* The //line// format comment is started by two dashes and extends to the end of the current line. e.g.

    -- This is a comment which extends to the end of this line only.

* The //multi-line// format comment is started by ##/*## and extends to the next occurrence of ##*/##, even if that occurs on a different line. e.g.

    /* This is a comment which
       extends over a number
       of text lines.
    */

* On the first line only of your program, you can use a special comment beginning with the two character sequence ###!##. This is mainly used to tell //Unix// shells which program to execute the 'script' program with. e.g.

    #!/home/rob/euphoria/bin/eui

This informs the Linux shell that your file should be executed by the Euphoria interpreter, and gives the full path to the interpreter. If you make your file executable, you can run it just by typing its name, without the need to type "##eui##". On //Windows// this line is just treated as a comment (though Apache Web server on //Windows// does recognize it). If your file is a shrouded ##.il## file, use ##eub.exe## instead of ##eui##.

Line comments are typically used to annotate a single line (or a small section) of code, whereas multi-line comments are typically used to give larger pieces of documentation inside the source text.

=== Expressions

Like other programming languages, Euphoria lets you calculate results by forming expressions. However, in Euphoria you can perform calculations on entire sequences of data with one expression, where in most other languages you would have to construct a loop. In Euphoria you can handle a sequence much as you would a single number. It can be copied, passed to a subroutine, or calculated upon as a unit. For example,

    {1,2,3} + 5

is an expression that adds the sequence ##{1,2,3}## and the ##atom 5## to get the resulting sequence ##{6,7,8}##. We will see more examples later.
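
To make the contrast with an explicit loop concrete, here is a small sketch, assuming Euphoria 4 syntax. The names ##prices##, ##with_fee## and ##looped## are hypothetical, and the loop relies on the ##repeat## and ##length## built-ins and on subscripting, all of which are described later in this chapter.

{{{
sequence prices = {10, 20, 30}

-- One expression operates on the whole sequence.
sequence with_fee = prices + 5        -- with_fee is {15, 25, 35}

-- The equivalent element-by-element loop.
sequence looped = repeat(0, length(prices))
for i = 1 to length(prices) do
    looped[i] = prices[i] + 5
end for
-- equal(with_fee, looped) is 1
}}}

Both forms produce the same sequence; the single expression simply leaves the iteration to Euphoria.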
==== Relational Operators The relational operators **##< > <= >= = !=##** each produce a ##1## (true) or a ##0## (false) result. 8.8 < 8.7 -- 8.8 less than 8.7 (false) -4.4 > -4.3 -- -4.4 greater than -4.3 (false) 8 <= 7 -- 8 less than or equal to 7 (false) 4 >= 4 -- 4 greater than or equal to 4 (true) 1 = 10 -- 1 equal to 10 (false) 8.7 != 8.8 -- 8.7 not equal to 8.8 (true) As we will soon see you can also apply these operators to sequences. ==== Logical Operators The logical operators ##and##, ##or##, ##xor##, and ##not## are used to determine the "truth" of an expression. e.g. 1 and 1 -- 1 (true) 1 and 0 -- 0 (false) 0 and 1 -- 0 (false) 0 and 0 -- 0 (false) 1 or 1 -- 1 (true) 1 or 0 -- 1 (true) 0 or 1 -- 1 (true) 0 or 0 -- 0 (false) 1 xor 1 -- 0 (false) 1 xor 0 -- 1 (true) 0 xor 1 -- 1 (true) 0 xor 0 -- 0 (false) not 1 -- 0 (false) not 0 -- 1 (true) You can also apply these operators to numbers other than ##1## or ##0##. The rule is: zero means false and non-zero means true. So for instance: 5 and -4 -- 1 (true) not 6 -- 0 (false) These operators can also be applied to sequences. See below. In some cases [[:short_circuit]] evaluation will be used for expressions containing ##and## or ##or##. Specifically, short circuiting applies inside decision making expressions. These are found in the [[:if statement]], [[:while statement]] and the [[:loop until statement]]. More on this later. ==== Arithmetic Operators The usual arithmetic operators are available: add, subtract, multiply, divide, unary minus, unary plus. 3.5 + 3 -- 6.5 3 - 5 -- -2 6 * 2 -- 12 7 / 2 -- 3.5 -8.1 -- -8.1 +8 -- +8 Computing a result that is too big (i.e. outside of -1e300 to +1e300) will result in one of the special atoms **+infinity** or **-infinity**. These appear as ##inf## or ##-inf## when you print them out. It is also possible to generate ##nan## or ##-nan##. "nan" means "not a number", i.e. an undefined value (such as ##inf## divided by ##inf##). 
These values are defined in the IEEE floating-point standard. If you see one of these special values in your output, it usually indicates an error in your program logic, although generating inf as an intermediate result may be acceptable in some cases. For instance, ##1/inf## is ##0##, which may be the "right" answer for your algorithm. Division by zero, as well as bad arguments to math library routines, e.g. square root of a negative number, log of a non-positive number etc. cause an immediate error message and your program is aborted. The only reason that you might use unary plus is to emphasize to the reader of your program that a number is positive. The interpreter does not actually calculate anything for this. ==== Operations on Sequences All of the relational, logical and arithmetic operators described above, as well as the math routines described in [[:Language Reference]], can be applied to sequences as well as to single numbers (atoms). When applied to a sequence, a unary (one operand) operator is actually applied to each element in the sequence to yield a sequence of results of the same length. If one of these elements is itself a sequence then the same rule is applied again recursively. e.g. x = -{1, 2, 3, {4, 5}} -- x is {-1, -2, -3, {-4, -5}} If a binary (two-operand) operator has operands which are both sequences then the two sequences must be of the same length. The binary operation is then applied to corresponding elements taken from the two sequences to get a sequence of results. e.g. x = {5, 6, 7, 8} + {10, 10, 20, 100} -- x is {15, 16, 27, 108} x = {{1, 2, 3}, {4, 5, 6}} + {-1, 0, 1} -- ERROR: 2 != 3 -- but x = {{1, 2, 3} + {-1, 0, 1}, {4, 5, 6} + {-1, 0, 1}} -- CORRECT -- x is {{0, 2, 4}, {3, 5, 7}} If a binary operator has one operand which is a sequence while the other is a single number (atom) then the single number is effectively repeated to form a sequence of equal length to the sequence operand. 
The rules for operating on two sequences then apply. Some examples:

    y = {4, 5, 6}
    w = 5 * y                          -- w is {20, 25, 30}
    x = {1, 2, 3}
    z = x + y                          -- z is {5, 7, 9}
    z = x < y                          -- z is {1, 1, 1}
    w = {{1, 2}, {3, 4}, {5}}
    w = w * y                          -- w is {{4, 8}, {15, 20}, {30}}
    w = {1, 0, 0, 1} and {1, 1, 1, 0}  -- w is {1, 0, 0, 0}
    w = not {1, 5, -2, 0, 0}           -- w is {0, 0, 0, 1, 1}
    w = {1, 2, 3} = {1, 2, 4}          -- w is {1, 1, 0}
    -- note that the first '=' is assignment, and the
    -- second '=' is a relational operator that tests
    -- equality

**Note:** When you wish to compare two strings (or other sequences), you should **not** (as in some other languages) use the '=' operator:

    if "APPLE" = "ORANGE" then -- ERROR!

'##=##' is treated as an operator, just like '##+##', '##*##' etc., so it is applied to corresponding sequence elements, and the sequences must be the same length. When they are of equal length, the result is a sequence of ones and zeros. When they are not of equal length, the result is an error. Either way you'll get an error, since an if-condition must be an atom, not a sequence. Instead you should use the ##equal## built-in routine:

    if equal("APPLE", "ORANGE") then -- CORRECT

In general, you can do relational comparisons using the ##compare## built-in routine:

    if compare("APPLE", "ORANGE") = 0 then -- CORRECT

You can use ##compare## for other comparisons as well:

    if compare("APPLE", "ORANGE") < 0 then -- CORRECT
        -- enter here if "APPLE" is less than "ORANGE" (TRUE)

Especially useful is the idiom ##compare(x, "") = 1## to determine whether ##x## is a non-empty sequence. ##compare(x, "") = -1## would test for ##x## being an atom, but ##atom(x) = 1## does the same faster and is clearer to read.

==== Subscripting of Sequences

A single element of a sequence may be selected by giving the element number in square brackets. Element numbers start at 1. Non-integer subscripts are rounded down to an integer.

For example, if ##x## contains ##{5, 7.2, 9, 0.5, 13}## then ##x[2]## is ##7.2##.
Suppose we assign something different to ##x[2]##:

    x[2] = {11,22,33}

Then ##x## becomes: ##{5, {11,22,33}, 9, 0.5, 13}##. Now if we ask for ##x[2]## we get ##{11,22,33}## and if we ask for ##x[2][3]## we get the ##atom## 33. If you try to subscript with a number that is outside of the range ##1## to the number of elements, you will get a subscript error. For example ##x[0]##, ##x[-99]## or ##x[6]## will cause errors. So will ##x[1][3]## since ##x[1]## is not a sequence. There is no limit to the number of subscripts that may follow a variable, but the variable must contain sequences that are nested deeply enough. The two dimensional array, common in other languages, can be easily represented with a sequence of sequences:

    x = {
         {5, 6, 7, 8, 9},  -- x[1]
         {1, 2, 3, 4, 5},  -- x[2]
         {0, 1, 0, 1, 0}   -- x[3]
        }

where we have written the numbers in a way that makes the structure clearer. An expression of the form x[i][j] can be used to access any element.

The two dimensions are not symmetric however, since an entire "row" can be selected with x[i], but you need to use [[:vslice]] in the Standard Library to select an entire column. Other logical structures, such as n-dimensional arrays, arrays of strings, structures, arrays of structures etc. can also be handled easily and flexibly:

3-D array:

    y = {
         {{1,1}, {3,3}, {5,5}},
         {{0,0}, {0,1}, {9,1}},
         {{-1,9},{1,1}, {2,2}}
        }
    -- y[2][3][1] is 9

Array of strings:

    s = {"Hello", "World", "Euphoria", "", "Last One"}
    -- s[3] is "Euphoria"
    -- s[3][1] is 'E'

A Structure:

    employee = {
                {"John","Smith"},
                45000,
                27,
                185.5
               }

To access "fields" or elements within a structure it is good programming style to make up an enum that names the various fields. This will make your program easier to read. For the example above you might have:

    enum NAME, SALARY, AGE, WEIGHT
    enum FIRST_NAME, LAST_NAME

    employees = {
                 {{"John","Smith"}, 45000, 27, 185.5},  -- employees[1]
                 {{"Bill","Jones"}, 57000, 48, 177.2},  -- employees[2]
                 -- .... etc.
                }
    -- employees[2][SALARY] would be 57000.
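
The enum-based field access above can be sketched as a complete fragment. This is only an illustration, assuming Euphoria 4 syntax; the variables ##total## and ##surname## are hypothetical, and the data repeats the two employee records from the example.

{{{
enum NAME, SALARY, AGE, WEIGHT
enum FIRST_NAME, LAST_NAME

sequence employees = {
    {{"John","Smith"}, 45000, 27, 185.5},
    {{"Bill","Jones"}, 57000, 48, 177.2}
}

-- Sum one "field" across all records.
atom total = 0
for i = 1 to length(employees) do
    total += employees[i][SALARY]
end for
-- total is 102000

-- Subscripts chain through the nested name sequence.
sequence surname = employees[2][NAME][LAST_NAME]   -- surname is "Jones"

-- Fields can be updated in place.
employees[1][AGE] += 1                             -- employees[1][AGE] is now 28
}}}

Because the enum values are ordinary integers starting at 1, ##employees[i][SALARY]## is exactly the same as ##employees[i][2]##, only easier to read.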
The ##length## built-in function will tell you how many elements are in a sequence. So the last element of a sequence ##s## is:

    s[length(s)]

A short-hand for this is:

    s[$]

Similarly,

    s[length(s)-1]

can be simplified to:

    s[$-1]

The ##$## may only appear between square brackets and it equals the length of the sequence that is being subscripted. Where there's nesting, e.g.:

    s[$ - t[$-1] + 1]

The first ##$## above refers to the length of ##s##, while the second ##$## refers to the length of ##t## (as you'd probably expect). An example where ##$## can save a lot of typing, make your code clearer, and probably even make it faster is:

    longname[$][$] -- last element of the last element

Compare that with the equivalent:

    longname[length(longname)][length(longname[length(longname)])]

**Subscripting and function side-effects:** In an assignment statement, with left-hand-side subscripts:

    lhs_var[lhs_expr1][lhs_expr2]... = rhs_expr

The expressions are evaluated, and any subscripting is performed, from left to right. It is possible to have function calls in the right-hand-side expression, or in any of the left-hand-side expressions. If a function call has the side-effect of modifying the lhs_var, it is not defined whether those changes will appear in the final value of the lhs_var, once the assignment has been completed. To be sure about what is going to happen, perform the function call in a separate statement, i.e. do not try to modify the lhs_var in two different ways in the same statement. Where there are no left-hand-side subscripts, you can always assume that the final value of the lhs_var will be the value of rhs_expr, regardless of any side-effects that may have changed lhs_var.

**Euphoria data structures are almost infinitely flexible.**

Arrays in many languages are constrained to have a fixed number of elements, and those elements must all be of the same type.
Euphoria eliminates both of those restrictions by defining all arrays (sequences) as a list of zero or more Euphoria objects whose element count can be changed at any time. You can easily add a new structure to the employee sequence above, or store an unusually long name in the NAME field and Euphoria will take care of it for you. If you wish, you can store a variety of different employee "structures", with different sizes, all in one sequence. However, when you retrieve a sequence element, it is not guaranteed to be of any particular type. You, as the programmer, need to check that the retrieved data is of the type you expect; Euphoria will not do so. The only thing it will check is whether an assignment is legal. For example, if you try to assign a sequence to an integer variable, Euphoria will complain at the time your code does the assignment.

Not only can a Euphoria program represent all conventional data structures but you can create very useful, flexible structures that would be hard to declare in many other languages.

Note that expressions in general may not be subscripted, just variables. For example: ##{5+2,6-1,7*8,8+1}[3]## is //not// supported, nor is something like: ##date()[MONTH]##. You have to assign the sequence returned by ##date## to a variable, then subscript the variable to get the month.

==== Slicing of Sequences

A sequence of consecutive elements may be selected by giving the starting and ending element numbers. For example if ##x## is ##{1, 1, 2, 2, 2, 1, 1, 1}## then ##x[3..5]## is the sequence ##{2, 2, 2}##. ##x[3..3]## is the sequence ##{2}##. ##x[3..2]## is also allowed. It evaluates to the zero length sequence ##{}##. If ##y## has the value: ##{"fred", "george", "mary"}## then ##y[1..2]## is ##{"fred", "george"}##.

We can also use slices for overwriting portions of variables. After ##x[3..5] = {9, 9, 9}## ##x## would be ##{1, 1, 9, 9, 9, 1, 1, 1}##. We could also have said ##x[3..5] = 9## with the same effect. Suppose ##y## is ##{0, "Euphoria", 1, 1}##.
Then ##y[2][1..4]## is ##"Euph"##. If we say ##y[2][1..4] = "ABCD"## then ##y## will become ##{0, "ABCDoria", 1, 1}##.

In general, a variable name can be followed by 0 or more subscripts, followed in turn by 0 or 1 slices. Only variables may be subscripted or sliced, not expressions.

We need to be a bit more precise in defining the rules for **empty slices**. Consider a slice ##s[i..j]## where ##s## is of length ##n##. A slice from ##i## to ##j##, where ##j = i - 1## and ##i >= 1## produces the [[:emptyseq "empty sequence"]], even if ##i = n + 1##. Thus ##1..0## and ##n + 1..n## and everything in between are legal **(empty) slices**. Empty slices are quite useful in many algorithms. A slice from ##i## to ##j## where ##j < i - 1## is illegal, i.e. "reverse" slices such as ##s[5..3]## are not allowed.

We can also use the ##$## shorthand with slices, e.g.

    s[2..$]
    s[5..$-2]
    s[$-5..$]
    s[$][1..floor($/2)] -- first half of the last element of s

==== Concatenation of Sequences and Atoms - The '&' Operator ====
@[amp concat|]
@[amp_concat|]

Any two objects may be concatenated using the **&** operator. The result is a sequence with a length equal to the sum of the lengths of the concatenated objects. e.g.

    {1, 2, 3} & 4              -- {1, 2, 3, 4}
    4 & 5                      -- {4, 5}
    {{1, 1}, 2, 3} & {4, 5}    -- {{1, 1}, 2, 3, 4, 5}
    x = {}
    y = {1, 2}
    y = y & x                  -- y is still {1, 2}

You can delete element ##i## of any sequence ##s## by concatenating the parts of the sequence before and after ##i##:

    s = s[1..i-1] & s[i+1..length(s)]

This works even when ##i## is ##1## or ##length(s)##, since ##s[1..0]## is a legal empty slice, and so is ##s[length(s)+1..length(s)]##.

==== Sequence-Formation

Finally, sequence-formation, using braces and commas:

    {a, b, c, ... }

is also an operator. It takes n operands, where ##n## is ##0## or more, and makes an n-element sequence from their values. e.g.

    x = {apple, orange*2, {1,2,3}, 99/4+foobar}

The sequence-formation operator is listed at the bottom of the [[:precedence chart]].
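
The slicing and concatenation rules from the preceding sections can be tried together in one short fragment. This is a sketch only, assuming Euphoria 4 syntax; the variable names are hypothetical and the commented values follow from the rules stated above.

{{{
sequence s = {10, 20, 30, 40, 50}

sequence middle  = s[2..4]    -- middle is {20, 30, 40}
sequence rest    = s[4..$]    -- rest is {40, 50}
sequence nothing = s[3..2]    -- nothing is {}, a legal empty slice

-- Delete element i by joining the slices on either side of it.
integer i = 1
s = s[1..i-1] & s[i+1..$]     -- s is {20, 30, 40, 50}; s[1..0] was empty

i = length(s)
s = s[1..i-1] & s[i+1..$]     -- s is {20, 30, 40}; s[5..4] was empty

-- Assigning an atom to a slice fills every selected element.
s[2..3] = 0                   -- s is {20, 0, 0}
}}}

The two deletions exercise the boundary cases: removing the first and the last element both depend on an empty slice being legal.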
==== Multiple Assignment Special sequence notation on the left hand side of an assignment can be made to assign to multiple variables with a single statement. This can be useful for using functions that return multiple values in a sequence, such as ##[[:value]]##. atom success, val { success, val } = value( "100" ) -- success = GET_SUCCESS -- val = 100 It is also possible to ignore some of the values in the right hand side. Any elements beyond the number supplied on the left hand side are ignored. Other values can also be ignored by using a question mark ('##?##') instead of a variable name: { ?, val } = value( "100" ) Variables may only appear once on the left hand side, however, they may appear on both the left and right hand side. For instance, to swap the values of two variables: { a, b } = { b, a } ==== Other Operations on Sequences Some other important operations that you can perform on sequences have English names, rather than special characters. These operations are built-in to **eui.exe/euiw.exe**, so they'll always be there, and so they'll be fast. They are described in detail in the [[:Language Reference]], but are important enough to Euphoria programming that we should mention them here before proceeding. You call these operations as if they were subroutines, although they are actually implemented much more efficiently than that. ===== length(sequence s) Returns the length of a sequence s. This is the number of elements in s. Some of these elements may be sequences that contain elements of their own, but ##length## just gives you the "top-level" count. Note however that the length of an atom is always ##1##. e.g. length({5,6,7}) -- 3 length({1, {5,5,5}, 2, 3}) -- 4 (not 6!) length({}) -- 0 length(5) -- 1 ===== repeat(object o1, integer count) Returns a sequence that consists of an item repeated count times. e.g. repeat(0, 100) -- {0,0,0,...,0} i.e. 
-- 100 zeros

    repeat("Hello", 3) -- {"Hello", "Hello", "Hello"}
    repeat(99,0)       -- {}

The item to be repeated can be any atom or sequence.

===== append(sequence s1, object o1)

Returns a new sequence by adding an object o1 to the end of a sequence s1.

    append({1,2,3}, 4)       -- {1,2,3,4}
    append({1,2,3}, {5,5,5}) -- {1,2,3,{5,5,5}}
    append({}, 9)            -- {9}

The length of the new sequence is always 1 greater than the length of the original sequence. The item to be added to the sequence can be any atom or sequence.

===== prepend(sequence s1, object o1)

Returns a new sequence by adding an object o1 to the beginning of a sequence s1. e.g.

    append({1,2,3}, 4)       -- {1,2,3,4}
    prepend({1,2,3}, 4)      -- {4,1,2,3}
    append({1,2,3}, {5,5,5}) -- {1,2,3,{5,5,5}}
    prepend({}, 9)           -- {9}
    append({}, 9)            -- {9}

The length of the new sequence is always one greater than the length of the original sequence. The item to be added to the sequence can be any atom or sequence.

These two built-in functions, ##append## and ##prepend##, have some similarities to the concatenate operator, ##&##, but there are clear differences. e.g.

    -- appending a sequence is different
    append({1,2,3}, {5,5,5}) -- {1,2,3,{5,5,5}}
    {1,2,3} & {5,5,5}        -- {1,2,3,5,5,5}
    -- appending an atom is the same
    append({1,2,3}, 5)       -- {1,2,3,5}
    {1,2,3} & 5              -- {1,2,3,5}

===== insert(sequence in_what, object what, atom position)

This function takes a target sequence, ##in_what##, shifts its tail one notch and plugs the object ##what## into the hole just created. The modified sequence is returned. For instance:

    s = insert("Joe",'h',3)      -- s is "Johe", another string
    s = insert("Joe","h",3)      -- s is {'J','o',{'h'},'e'}, not a string
    s = insert({1,2,3},4,-0.5)   -- s is {4,1,2,3}, like prepend()
    s = insert({1,2,3},4,8.5)    -- s is {1,2,3,4}, like append()

The length of the returned sequence is one more than the length of ##in_what##. This is the same rule as for ##append## and ##prepend## above, which are actually special cases of ##insert##.
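
The difference between these routines and the ##&## operator shows up most clearly in the lengths of the results. The fragment below is a sketch, assuming Euphoria 4 syntax; all variable names are hypothetical.

{{{
sequence s = {1, 2, 3}

sequence nested = append(s, {4, 5})   -- nested is {1, 2, 3, {4, 5}}, length 4
sequence flat   = s & {4, 5}          -- flat is {1, 2, 3, 4, 5}, length 5
sequence front  = prepend(s, 0)       -- front is {0, 1, 2, 3}
sequence placed = insert(s, 99, 2)    -- placed is {1, 99, 2, 3}

-- A common pattern: grow a sequence one element at a time.
sequence squares = {}
for n = 1 to 5 do
    squares = append(squares, n * n)
end for
-- squares is {1, 4, 9, 16, 25}
}}}

None of these calls modifies ##s## itself; each returns a new sequence, which is why the loop assigns the result of ##append## back to ##squares##.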
===== splice(sequence in_what, object what, atom position) If what is an ##atom##, this is the same as ##insert##. But if what is a sequence, that sequence is inserted as successive elements into ##in_what## at ##position##. Example: s = splice("Joe",'h',3) -- s is "Johe", like insert() s = splice("Joe","hn Do",3) -- s is "John Doe", another string s = splice("Joh","n Doe",9.3) -- s is "John Doe", like with the & operator s = splice({1,2,3},4,-2) -- s is {4,1,2,3}, like with the & operator in reversed order The length of ##splice(in_what, what, position)## is always ##length(in_what) + length(what)##, like for concatenation using ##&##. === Precedence Chart When two or more operators follow one another in an expression, there must be rules to tell in which order they should be evaluated, as different orders usually lead to different results. It is common and convenient to use a **precedence order** on operators. Operators with the highest degree of precedence are evaluated first, then those with the highest precedence among what remains, and so on. The precedence of operators in expressions is as follows:
**highest precedence**
{{{
function/type calls
unary-  unary+  not
*  /
+  -
&
<  >  <=  >=  =  !=
and  or  xor
}}}
**lowest precedence**
{{{
{ , , , }
}}}
Thus ##2+6*3## means ##2+(6*3)## rather than ##(2+6)*3##. Operators on the same line above have equal precedence and are evaluated left to right. You can force any order of operations by placing round brackets ##( )## around an expression. For instance, ##6/3*5## is ##2*5##, not ##6/15##. Different languages or contexts may have slightly different precedence rules. You should be careful when translating a formula from one language to another; Euphoria is no exception. Adding superfluous parentheses to explicitly denote the exact order of evaluation does not cost much, and may help either readers used to some other precedence chart or translating to or from another context with slightly different rules.
Watch out for ##and## and ##or##, or ##*## and ##/##. The equals symbol ##'='## used in an [[:assignment statement]] is not an operator, it's just part of the syntax of the language. %%output=lang_decl == Declarations :<> === Identifiers **Identifiers**, which encompass all explicitly declared variable, constant or routine names, may be of any length. Upper and lower case are distinct. Identifiers must start with a letter or underscore and then be followed by any combination of letters, digits and underscores. The following **reserved words** have special meaning in Euphoria and cannot be used as identifiers: and export public as fallthru retry break for return by function routine case global switch constant goto then continue if to do ifdef type else include until elsedef label while elsif loop with elsifdef namespace without end not xor entry or enum override exit procedure For example, the ##edx## editor displays these words in blue. The following are Euphoria built-in routines.
It is best if you do not use these for your own identifiers: abort getenv peek4s system and_bits gets peek4u system_exec append hash peeks tail arctan head platform tan atom include_paths poke task_clock_start c_func insert poke2 task_clock_stop c_proc integer poke4 task_create call length position task_list call_func log power task_schedule call_proc machine_func prepend task_self clear_screen machine_proc print task_status close match printf task_suspend command_line match_from puts task_yield compare mem_copy rand time cos mem_set remainder trace date not_bits remove xor_bits delete object repeat ? delete_routine open replace & equal option_switches routine_id $ find or_bits sequence find_from peek sin floor peek_string splice get_key peek2s sprintf getc peek2u sqrt Identifiers can be used in naming the following: * procedures * functions * types * variables * constants * enums @[end|] ==== procedures These perform some computation and may contain a list of parameters, e.g. procedure empty() end procedure procedure plot(integer x, integer y) position(x, y) puts(1, '*') end procedure There are a fixed number of named parameters, but this is not restrictive since any parameter could be a variable-length sequence of arbitrary objects. In many languages variable-length parameter lists are impossible. In C, you must set up strange mechanisms that are complex enough that the average programmer cannot do it without consulting a manual or a local guru. A copy of the value of each argument is passed in. The formal parameter variables may be modified inside the procedure but this does not affect the value of the arguments. Pass by reference can be achieved using indexes into some fixed sequence. ;**Performance Note~:** :The interpreter does not actually copy sequences or floating-point numbers unless it becomes necessary. For example, y = {1,2,3,4,5,6,7,8.5,"ABC"} x = y The statement ##x = y## does not actually cause a new copy of ##y## to be created. 
Both ##x## and ##y## will simply "point" to the same sequence. If we later perform ##x[3] = 9##, then a separate sequence will be created for ##x## in memory (although there will still be just one shared copy of ##8.5## and ##"ABC"##). The same thing applies to "copies" of arguments passed in to subroutines. For a number of procedures or functions~--see below~--some parameters may have the same value in many cases. The most expected value for any parameter may be given a default value. To pass the default value, use a question mark ##?##, or omit the value. When the parameter is not the last in the list to the routine, you should use the ##?## for clarity, rather than simply omitting the parameter, and having consecutive commas. procedure foo(sequence s, integer n=1) ? n + length(s) end procedure foo("abc") -- prints out 4 = 3 + 1. n was not specified, so was set to 1. foo("abc", ? ) -- prints out 4 = 3 + 1. n was not specified, so was set to 1. foo("abc", 3) -- prints out 6 = 3 + 3 This is not limited to the last parameter(s): procedure bar(sequence s="abc", integer n, integer p=1) ? length(s)+n+p end procedure bar(?, 2) -- prints out 6 = 3 + 2 + 1 bar(, 2) -- prints out 6 = 3 + 2 + 1. Legal, but considered bad form. bar(2) -- errors out, as 2 is not a sequence bar(?, 2, ?) -- same as bar(,2) bar(?, 2, 3) -- prints out 8 = 3 + 2 + 3 bar({}, 2, ?) -- prints out 3 = 0 + 2 + 1 bar() -- errors out, second parameter is omitted, -- but doesn't have a default value Any expression may be used in a default value. Parameters that have been already mentioned may even be part of the expression: procedure baz(sequence s, integer n=length(s)) ? n end procedure baz("abcd") -- prints out 4 ==== functions These are just like procedures, but they return a value, and can be used in an expression, e.g. function max(atom a, atom b) if a >= b then return a else return b end if end function ==== return statement Any Euphoria object can be returned.
You can, in effect, have multiple return values, by returning a sequence of objects. e.g. return {x_pos, y_pos} However, Euphoria does not have variable lists. When you return a sequence, you still have to dispatch its contents to variables as needed. And you cannot pass a sequence of parameters to a routine, unless using [[:call_func]] or [[:call_proc]], which carries a performance penalty. We will use the general term "subroutine", or simply "routine" when a remark is applicable to both procedures and functions. Defaulted parameters can be used in functions exactly as they are in procedures. See the section above for a few examples. ==== types These are special functions that may be used in declaring the allowed values for a variable. A type must have exactly one parameter and should return an atom that is either true (non-zero) or false (zero). Types can also be called just like other functions. See [[:Specifying the Type of a variable]]. Although there are no restrictions to using defaulted parameters with types, their use is so much constrained by a type having exactly one parameter that they are of little practical help there. You cannot use a type to perform any adjustment to the value being checked, if only because this value may be the temporary result of an expression, not an actual variable. ==== variables These may be assigned values during execution e.g. -- x may only be assigned integer values integer x x = 25 -- a, b and c may be assigned *any* value object a, b, c a = {} b = a c = 0 When you declare a variable you name the variable (which protects you against making spelling mistakes later on) and you define which sort of values may legally be assigned to the variable during execution of your program. The simple act of declaring a variable does not assign any value to it. If you attempt to read it before assigning any value to it, Euphoria will issue a run-time error as "variable xyz has never been assigned a value". 
To guard against forgetting to initialize a variable, and also because it may make the code clearer to read, you can combine declaration and assignment: integer n = 5 This is equivalent to integer n n = 5 It is not infrequent that one defines a private variable that bears the same name as one already in scope. You can reuse the value of that variable when performing an initialization on declare by using a default namespace for the current file: namespace app integer n n=5 procedure foo() integer n = app:n + 2 ? n end procedure foo() -- prints out 7 ==== constants These are variables that are assigned an initial value that can never change e.g. constant MAX = 100 constant Upper = MAX - 10, Lower = 5 constant name_list = {"Fred", "George", "Larry"} The result of any expression can be assigned to a constant, even one involving calls to previously defined functions, but once the assignment is made, the value of the constant variable is "locked in". Constants may not be declared inside a subroutine. ==== enum An enumerated value is a special type of constant where the first value defaults to the number 1 and each item after that is incremented by 1 by default. An optional ##by## keyword can be supplied to change the increment value. As with sequences, enums can also be terminated with a ##$## for ease of editing ##enum## lists that may change frequently during development. enum ONE, TWO, THREE, FOUR -- ONE is 1, TWO is 2, THREE is 3, FOUR is 4 You can change the value of any one item by assigning it a numeric value. Enums can only take numeric values. You cannot set the starting value to an expression or other variable. Subsequent values are always the previous value plus one, unless they too are assigned a default value. 
enum ONE, TWO, THREE, ABC=10, DEF, XYZ -- ONE is 1, TWO is 2, THREE is 3 -- ABC is 10, DEF is 11, XYZ is 12 Euphoria sequences use integer indexes, but with ##enum## you may write code like this: enum X, Y sequence point = { 0,0 } point[X] = 3 point[Y] = 4 By default, unless an enum member is being specifically set to some value, its value will be one more than the previous member's value, with the first default value being ##1##. This default can be overridden. The syntax is: enum by DELTA member1, member2, ... ,memberN where ##'DELTA'## is a literal number with an optional operation code (##*, +, -, /##) preceding it. Examples: enum by 2 A,B,C=6,D --> values are 1,3,6,8 enum by -2 A=10,B,C,D --> values are 10,8,6,4 enum by * 2 A,B,C,D,E --> values are 1,2,4,8,16 enum by / 3 A=81,B,C,D,E --> values are 81,27,9,3,1 Also note that enum members do not have to be integers. enum by / 2 A=5,B,C --> values are 5, 2.5, 1.25 ==== enum type There is also a special form of ##enum##, an //enum type//. This is a simple way to write a user-defined type based on the set of values in a specific enum group. The type created this way can be used anywhere a normal user-defined type can be used. For example, enum type RGBA RED, GREEN, BLUE, ALPHA end type -- Only allow values of RED, GREEN, BLUE, or ALPHA as parameters procedure xyz( RGBA x, RGBA y) -- do stuff... end procedure However there is one significant difference when it comes to enum types. For normal types, when calling the type function, it returns either ##0## or ##1##. The enum type function returns ##0## if the argument is not a member of the enum set, and it returns a non-zero atom when the argument is a member. The value returned might be ##1## or it might not be ##1##. Don't rely on the type returning one. In EUPHORIA any atom that is not ##0## is true. The non-zero value returned differs between 4.0 and 4.1. 
For example, enum type color RED=4, GREEN=7, BLACK=1, BLUE=3 , PINK=10 end type -- color(RED) --> TRUE but might not be 1. if color(GREEN) then -- good. -- do stuff end if -- color(BLUE) --> also TRUE but might not be 1. if color(BLUE) = 1 then -- BAD, very BAD. -- BLUE is a color but this branch might not get executed. -- do stuff end if -- As a matter of style you may compare to 0. if color(BLACK) != 0 then -- good. Any non-zero is true in EUPHORIA -- So, compare to 0 if you wish. -- do stuff end if === Specifying the type of a variable So far you've already seen some examples of variable types but now we will define types more precisely. Variable declarations have a type name followed by a list of the variables being declared. For example, object a global integer x, y, z procedure fred(sequence q, sequence r) The types: **object**, **sequence**, **atom** and **integer** are **predefined**. Variables of type **object** may take on //any// value. Those declared with type **sequence** must always be sequences. Those declared with type **atom** must always be atoms. Variables declared with type **integer** must be atoms with integer values from ##-1073741824## to ##+1073741823## inclusive. You can perform exact calculations on larger integer values, up to about ##15## decimal digits, but declare them as **atom**, rather than integer. ;**Note~:** :In a procedure or function parameter list like the one for ##fred## above, a type name may only be followed by a single parameter name.
;**Performance Note~:** :Calculations using variables declared as integer will usually be somewhat faster than calculations involving variables declared as atom. If your machine has floating-point hardware, Euphoria will use it to manipulate atoms that are not integers. If your machine doesn't have floating-point hardware (this may happen on old 386 or 486 PCs), Euphoria will call software floating-point arithmetic routines contained in **euid.exe** (or in //Windows//). You can force ##eui.exe## to bypass any floating-point hardware, by setting an environment variable: SET NO87=1 The slower software routines will be used, but this could be of some advantage if you are worried about the floating-point bug in some early Pentium chips. @[udt|] ==== User-defined types To augment the [[:predefined types]], you can create **user-defined types**. All you have to do is define a single-parameter function, but declare it with **type ... end type** instead of **function ... end function**. For example, type hour(integer x) return x >= 0 and x <= 23 end type hour h1, h2 h1 = 10 -- ok h2 = 25 -- error! program aborts with a message Variables ##h1## and ##h2## can only be assigned integer values in the range ##0## to ##23## inclusive. After each assignment to ##h1## or ##h2## the interpreter will call ##hour##, passing the new value. The value will first be checked to see if it is an integer (because of "integer x"). If it is, the return statement will be executed to test the value of ##x## (i.e. the new value of ##h1## or ##h2##). If ##hour## returns true, execution continues normally. If ##hour## returns false then the program is aborted with a suitable diagnostic message. "hour" can be used to declare subroutine parameters as well: procedure set_time(hour h) ##set_time## can only be called with a reasonable value for parameter ##h##, otherwise the program will abort with a message. 
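Because a type is also an ordinary function, it can be called directly to test a value without aborting the program. The following is only a minimal sketch that repeats the ##hour## type from above so that it is self-contained; the test values are arbitrary.

```euphoria
-- same definition as the hour type shown above
type hour(integer x)
    return x >= 0 and x <= 23
end type

? hour(10)   -- 1 : 10 is a legal hour
? hour(25)   -- 0 : 25 is out of range, but a direct call does not abort
```

Note that the arguments here are integers. Passing a non-integer would fail the ##integer x## parameter check before the body of ##hour## is reached.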
A variable's type will be checked after each assignment to the variable (except where the compiler can predetermine that a check will not be necessary), and the program will terminate immediately if the type function returns false. Subroutine parameter types are checked each time that the subroutine is called. This checking guarantees that a variable can never have a value that does not belong to the type of that variable. Unlike other languages, the type of a variable does not affect any calculations on the variable, nor the way its contents are displayed. Only the value of the variable matters in an expression. The type just serves as an error check to prevent any "corruption" of the variable. User-defined types can catch unexpected logical errors in your program. They are not designed to catch or correct user input errors. In particular, they cannot adjust a wrong value to some other, presumably legal, one. @[type_check|] Type checking can be turned off or on between subroutines using ##with type_check## or ##without type_check## (see [[:specialstatements]]). It is initially on by default. ;**Note to Benchmarkers~:** : When comparing the speed of Euphoria programs against programs written in other languages, you should specify **without type_check** at the top of the file. This gives Euphoria permission to skip run-time type checks, thereby saving some execution time. All other checks are still performed, e.g. subscript checking, uninitialized variable checking etc. Even when you turn off type checking, Euphoria reserves the right to make checks at strategic places, since this can actually allow it to run your program //faster// in many cases. So you may still get a type check failure even when you have turned off type checking. Whether type checking is on or off, you will never get a **//machine-level//** exception. **You will always get a meaningful message from Euphoria when something goes wrong**.
(//This might not be the case when you [[:poke]] directly into memory, or call routines written in C or machine code.//) Euphoria's way of defining types is simpler than what you will find in other languages, yet Euphoria provides the programmer with //greater// flexibility in defining the legal values for a type of data. Any algorithm can be used to include or exclude values. You can even declare a variable to be of type object which will allow it to take on //any// value. Routines can be written to work with very specific types, or very general types. For many programs, there is little advantage in defining new types, and you may wish to stick with the four [[:predefined types]]. Unlike other languages, Euphoria's type mechanism is optional. You don't need it to create a program. However, for larger programs, strict type definitions can aid the process of debugging. Logic errors are caught close to their source and are not allowed to propagate in subtle ways through the rest of the program. Furthermore, it is easier to reason about the misbehavior of a section of code when you are guaranteed that the variables involved always had a legal value, if not the desired value. Types also provide meaningful, machine-checkable documentation about your program, making it easier for you or others to understand your code at a later date. Combined with the subscript checking, uninitialized variable checking, and other checking that is always present, strict run-time type checking makes debugging much easier in Euphoria than in most other languages. It also increases the reliability of the final program since many latent bugs that would have survived the testing phase in other languages will have been caught by Euphoria. ;**Anecdote 1~:** : In porting a large C program to Euphoria, a number of latent bugs were discovered. 
Although this C program was believed to be totally "correct", we found: a situation where an uninitialized variable was being read; a place where element number "-1" of an array was routinely written and read; and a situation where something was written just off the screen. These problems resulted in errors that weren't easily visible to a casual observer, so they had survived testing of the C code. ;**Anecdote 2~:** :The Quick Sort algorithm presented on page 117 of //Writing Efficient Programs// by Jon Bentley has a subscript error! The algorithm will sometimes read the element just //before// the beginning of the array to be sorted, and will sometimes read the element just //after// the end of the array. Whatever garbage is read, the algorithm will still work - this is probably why the bug was never caught. But what if there isn't any (virtual) memory just before or just after the array? Bentley later modifies the algorithm such that this bug goes away~--but he presented this version as being correct. **//Even the experts need subscript checking!//** ;**Performance Note~:** :When typical user-defined types are used extensively, type checking adds only 20 to 40 percent to execution time. Leave it on unless you really need the extra speed. You might also consider turning it off for just a few heavily-executed routines. [[:Profiling]] can help with this decision. ==== integer An Euphoria ##integer## is a mathematical integer restricted to the range ##-1,073,741,824## to ##+1,073,741,823##. As a result, a variable of the integer type, while allowing computations as fast as possible, cannot hold 32-bit machine addresses, even though the latter are mathematical integers. You must use the [[:atom]] type for this purpose. Also, even though the product of two integers is a mathematical integer, it may not fit into an integer, and should be kept in an atom instead. 
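The point about products can be illustrated with a short sketch. The variable names ##big## and ##product## are illustrative only; the built-in types ##integer## and ##atom## are called here as functions to test the values.

```euphoria
integer big = 1_000_000       -- well inside the integer range
atom product = big * big      -- 1e12: a whole number, but too large for an integer

? integer(big)       -- 1
? integer(product)   -- 0 : outside the integer range
? atom(product)      -- 1
```

Declaring ##product## as ##integer## instead would cause a type check failure at the assignment, which is why such results should be kept in an ##atom##.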
==== atom An ##atom## can hold three kinds of data: * Mathematical integers in the range ##-power(2,53)## to +##power(2,53)## * Floating point numbers, in the range ##-power(2,1024)+1## to ##+power(2,1024)-1## * Large mathematical integers in the same range, but with a fuzz that grows with the magnitude of the integer. ##power(2,53)## is slightly above 9×10^^15^^, ##power(2,1024)## is in the 10^^308^^ range. Because of these constraints, which arise in part from common hardware limitations, some care is needed for specific purposes: * The sum or product of two integers is an ##atom##, but may not be an ##integer##. * Memory addresses, or handles acquired from anything outside Euphoria, including the operating system, **must** be stored as an ##atom##. * For large numbers, usual operations may yield strange results: integer n = power(2, 27) -- ok integer n_plus = n + 1, n_minus = n - 1 -- ok atom a = n * n -- ok atom a1 = n_plus * n_minus -- still ok ? a - a1 -- prints 0, should be 1 mathematically //This is not an Euphoria bug//. The IEEE 754 standard for floating point numbers provides for 53 bits of precision for any real number, and an accurate computation of ##a-a1## would require 54 of them. Intel FPU chips do have 64 bit precision registers, but the low order 16 bits are only used internally, and Intel recommends against using them for high precision arithmetic. Their SIMD machine instruction set only uses the IEEE 754 defined format. ==== sequence A sequence is a type that is a //container//. A sequence has //elements// which can be accessed through their //index//, like in ##my_sequence[3]##. Sequences are generic enough to store all sorts of data structures: strings, trees, lists, anything. Accesses to sequences are always bounds-checked, so that you cannot read or write an element that does not exist, ever. A large number of extraction and shape-changing operations on sequences are available, both as built-in operations and library routines.
The elements of a sequence can have any type. ##sequence##s are implemented very efficiently. Programmers used to pointers will soon notice that they can get most usual pointer operations done using sequence indexes. The loss in efficiency is usually hard to notice, and the gain in code safety and bug prevention far outweighs it. ==== object This type can hold any data Euphoria can handle, both atoms and sequences. When called as a function, the ##object## type returns ##0## if the variable is not initialized, and ##1## otherwise. === Scope ==== Why scopes, and what are they? The //scope// of an identifier is the portion of the program where its declaration is in effect, i.e. where that identifier is //visible//. Euphoria has many pre-defined procedures, functions and types. These are defined automatically at the start of any program. For example, the ##edx## editor shows them in magenta. These pre-defined names are not reserved. You can override them with your own variables or routines. It is possible to use a user-defined identifier before it has been declared, provided that it will be declared at some point later in the program. For example, procedures, functions and types can call themselves or one another //recursively//. Mutual recursion, where routine A calls routine B which directly or indirectly calls routine A, implies one of A or B being called before it is defined. This was traditionally the most frequent situation which required using the [[:routine_id]] mechanism, but is now supported directly. See [[:Indirect Routine Calling]] for more details on the [[:routine_id]] mechanism. ==== Defining the scope of an identifier The scope of an identifier is a description of what code can 'access' it. Code in the same scope of an identifier can access that identifier and code not in the same scope cannot access it. The scope of a **variable** depends upon where and how it is declared.
* If it is declared within a ##**for**##, ##**while**##, ##**loop**## or ##**switch**##, its scope starts at the declaration and ends at the respective ##**end**## statement. * In an ##**if**## statement, the scope starts at the declaration and ends either at the next ##**else**##, ##**elsif**## or ##**end if**## statement. * If a variable is declared within a routine (known as a private variable) and outside one of the structures listed above, the scope of the variable starts at the declaration and ends at the routine's ##**end**## statement. * If a variable is declared outside of a routine (known as a module variable), and does not have a scope modifier, its scope starts at the declaration and ends at the end of the file it is declared in. The scope of a **constant** that does not have a scope modifier, starts at the declaration and ends at the end of the file it is declared in. The scope of a **enum** that does not have a scope modifier, starts at the declaration and ends at the end of the file it is declared in. The scope of all **procedures**, **functions** and **types**, which do not have a scope modifier, starts at the beginning of the source file and ends at the end of the source file in which they are declared. In other words, these can be accessed by any code in the same file. Constants, enums, module variables, procedures, functions and types, which do not have a scope modifier are referred to as **local**. However, these identifiers can have a scope modifier preceding their declaration, which causes their scope to extend beyond the file they are declared in. * If the keyword **global** precedes the declaration, the scope of these identifiers extends to the whole application. They can be accessed by code anywhere in the application files. 
* If the keyword **public** precedes the declaration, the scope extends to any file that explicitly includes the file in which the identifier is declared, or to any file that includes a file that in turn ##public include##s the file containing the ##public## declaration. * If the keyword **export** precedes the declaration, the scope only extends to any file that directly includes the file in which the identifier is declared. When you **[[:include]]** a Euphoria file in another file, only the identifiers declared using a scope modifier are accessible to the file doing the include. The other declarations in the included file are invisible to the file doing the include, and you will get an error message, "##Errors resolving the following references##", if you try to use them. There is a variant of the **include** statement, called **public include**, which will be discussed later and behaves differently on **public** symbols. Note that **constant** and **enum** declarations must be outside of any subroutine. Euphoria encourages you to restrict the scope of identifiers. If all identifiers were automatically global to the whole program, you might have a lot of naming conflicts, especially in a large program consisting of files written by many different programmers. A naming conflict might cause a compiler error message, or it could lead to a very subtle bug, where different parts of a program accidentally modify the same variable without being aware of it. Try to use the most restrictive scope that you can. Make variables **private** to one routine where possible, and where that is not possible, make them **local** to a file, rather than **global** to the whole program. And whenever an identifier needs to be known from a few files only, make it **public** or **export** so as to hide it from whoever does not need to see it ~-- and might some day define the same identifier.
For example: -- sublib.e export procedure bar() ?0 end procedure -- some_lib.e include sublib.e export procedure foo() ?1 end procedure bar() -- ok, declared in sublib.e -- my_app.exw include some_lib.e foo() -- ok, declared in some_lib.e bar() -- error! bar() is not declared here Why not declare ##foo## as global, as it is meant to be used anywhere? Well, one could, but this will increase the risks of name conflicts. This is why, for instance, all public identifiers from the standard library have **public** scope. **global** should be used rarely, if ever. Because earlier versions of Euphoria didn't have **public** or **export**, it has to remain there for a while. One should be very sure of not polluting any foreign file's symbol table before using **global** scope. Built-in identifiers act as if declared as **global** ~-- but they are not declared in any Euphoria coded file. ==== Using namespaces @[namespace|] Euphoria namespaces are used to disambiguate between symbols (routines, variables, constants, etc) with the same names in different files. They may be declared as a default namespace in a file for the convenience of the users of that file, or they may be declared at the point where a file is included. Note that unlike namespaces in some other languages, this does not provide a sandbox around the symbols in the file. It is just an easy way to tell euphoria to look for a symbol in a particular file. Identifiers marked as ##global##, ##public## or ##export## are known as //exposed// variables because they can be used in files other than the one they were declared in. All other identifiers can only be used within their own file. This information is helpful when maintaining or enhancing the file, or when learning how to use the file. You can make changes to the internal routines and variables, without having to examine other files, or notify other users of the include file. 
Sometimes, when using include files developed by others, you will encounter a naming conflict. One of the include file authors has used the same name for an exposed identifier as one of the other authors. One way of fixing this, if you have the source, is to simply edit one of the include files to correct the problem. However, you would then have to repeat this process whenever a new version of the include file was released.

Euphoria has a simpler way to solve this. Using an extension to the include statement, you can say for example:

    include johns_file.e as john
    include bills_file.e as bill

    john:x += 1
    bill:x += 2

In this case, the variable ##x## was declared in two different files, and you want to refer to both variables in your file. Using the //namespace identifier// of either ##john## or ##bill##, you can attach a prefix to ##x## to indicate which ##x## you are referring to. We sometimes say that ##john## refers to one //namespace//, while ##bill## refers to another distinct //namespace//. You can attach a namespace identifier to any user-defined variable, constant, procedure or function. You can do it to solve a conflict, or simply to make things clearer. A namespace identifier has local scope. It is known only within the file that declares it, i.e. the file that contains the include statement. Different files might define different namespace identifiers to refer to the same included file.

There is a special, reserved namespace, ##**eu**##, for referring to built-in Euphoria routines. This can be useful when a built-in routine has been overridden:

    procedure puts( integer fn, object text )
        eu:puts(fn, "Overloaded puts says: "& text )
    end procedure

    puts(1, "Hello, world!\n")
    eu:puts(1, "Hello, world!\n")

Files can also declare a default namespace to be used with the file. When a file with a default namespace is included, if the include statement did not specify a namespace, then the default namespace will be automatically declared in that file.
If the include statement declares a namespace for the newly included file, then the specified namespace will be available instead of the default. No two files can use the same namespace identifier. If two files with the same default namespace are included, at least one of them must be given a different namespace in its include statement.

To declare a default namespace in a file, the first token (whitespace and comments are ignored) should be 'namespace' followed by the desired name:

    -- foo.e : this file does some stuff
    namespace foo

A namespace that is declared as part of an ##include## statement is local to the file where the ##include## statement is. A default namespace declared in a file is considered a public symbol in that file. Namespaces and other symbols (e.g., variables, functions, procedures and types) can have the same name without conflict. A namespace declared through an ##include## statement will mask a default namespace declared in another file, just like a normal local variable will mask a public variable in another file. In this case, rather than using the default namespace, declare a new namespace through the ##include## statement.

Note that declaring a namespace, either through the include statement or as a default namespace, does not **require** that every symbol reference be qualified with that namespace. The namespace simply **allows** the user to disambiguate symbols with the same name in different files, or to be explicit about where symbols are coming from for the purposes of clarity, or to avoid possible future conflicts.

A qualified reference does not absolutely restrict the reference to symbols that actually reside within the specified file. It can also apply to symbols included by that file. This is especially useful for multi-file libraries.
Programmers can use a single namespace for the library, even though some of the visible symbols in that library are not declared in the main file: -- lib.e namespace lib public include sublib.e public procedure main() ... -- sublib.e public procedure sub() ... -- app.ex include lib.e lib:main() lib:sub() Now, what happens if you do not use 'public include'? -- lib2.e include sublib.e ... -- app2.ex include lib.e lib:main() lib:sub() -- error. sub() is visible in lib2.e but not in app2.ex ==== The visibility of public and export identifiers When a file needs to see the public or exported identifiers in another file that includes the first file, the first file must include that other (including) file. For example, -- Parent file: foo.e -- public integer Foo = 1 include bar.e -- bar.e needs to see Foo showit() -- execute a routine in bar.e -- Included file: bar.e -- include foo.e -- included so I can see Foo constant xyz = Foo + 1 public procedure showit() ? xyz end procedure //Public// symbols can only be seen by the file that explicitly includes the file where those public symbols are declared. For example, -- Parent file: foo.e -- include bar.e showit() -- execute a public routine in bar.e If however, a file wants a third file to also see the symbols that it can, it needs to do a ##public include##. For example, -- Parent file: foo.e -- public include bar.e showit() -- execute a public routine in bar.e public procedure fooer() . . . end procedure -- Appl file: runner.ex -- include foo.e showit() -- execute a public routine that foo.e can see in bar.e fooer() -- execute a public routine in foo.e The ##public include## facility is designed to make having a library composed of multiple files easy for an application to use. It allows the main library file to expose symbols in files that //it// includes as if the application had actually included them. 
That way, symbols meant for the end user can be declared in files other than the main file, and the library can still be organized however the author prefers without affecting the end user.

**Another example**\\
Given that we have two files LIBA.e and LIBB.e ...

    -- LIBA.e --
    public constant
        foo1 = 1,
        foo2 = 2
    export function foobarr1()
        return 0
    end function
    export function foobarr2()
        return 0
    end function

and

    -- LIBB.e --
    -- I want to pass on just the constants not
    -- the functions from LIBA.e.
    public include LIBA.e

The ##export## scope modifier is used to limit the extent to which symbols can be accessed. It works just like ##public## except that ##export## symbols are only ever passed up one level. In other words, if a file wants to use an ##export## symbol, that file must explicitly include the file that declares it.

In the example above, code in LIBB.e can see both the public and export symbols declared in LIBA.e (##foo1##, ##foo2##, ##foobarr1## and ##foobarr2##) because it explicitly includes LIBA.e. And by using the ##public## prefix on the ##include## of LIBA.e, it also allows any file that includes LIBB.e to see the ##public## symbols from LIBA.e, but not any ##export## symbols declared in LIBA.e.

In short, a ##public include## is used to expose the ##public## symbols of an included file up one level, but not any ##export## symbols that were included.

==== The complete set of resolution rules

**Resolution** is the process by which the interpreter determines which specific symbol will actually be used at any given point in the code.

This is usually quite easy as most symbol names in a given scope are unique and so Euphoria does not have to choose between them. However, when the same symbol name is used in different but enclosing scopes, Euphoria has to make a decision about which symbol the coder is referring to.
When Euphoria sees an identifier name being used, it looks for the name's declaration starting from the current scope and moving outwards through the enclosing scopes until the name's declaration is found. The hierarchy of scopes can be viewed like this ... {{{ global/public/export file routine block 1 block 2 ... block n }}} So, if a name is used at a ##block## level, Euphoria will first check for its declaration in the same block, and if not found will check the enclosing blocks until it reaches the routine level, in which case it checks the routine (including parameter names), and then check the file that the block is declared in and finally check the global/public/export symbols. By the way, Euphoria will not allow a name to be declared if it is already declared in the same scope, or enclosing ##block## or enclosing ##routine##. Thus the following examples are illegal... integer a if x then integer a -- redefinition not allowed. end if if x then integer a if y then integer a -- redefinition not allowed. end if end if procedure foo(integer a) if x then integer a -- redefinition not allowed. end if end procedure But note that this below is valid ... integer a = 1 procedure foo() integer a = 2 ? a end procedure ? a In this situation, the second declaration of 'a' is said to //shadow// the first one. The output from this example will be ... > {{{ 2 1 }}} Symbols all declared in the same file (be they in blocks, routines or at the file level) are easy to check by Euphoria for scope clashes. However, a problem can arise when symbol names declared as global/public/export in different files are placed in the same scope during ##include## processing. As it is quite possible for these files to come from independent developers that are not aware of each other's symbol names, the potential for name clashes is high. A name clash is just when the same name is declared in the same scope but in different files. 
Euphoria cannot generally decide which name you were referring to when this happens, so it needs your help to resolve it. This is where the ##namespace## concept is used.

A namespace is just a name that you assign to an include file so that your code can exactly specify where an identifier that your code is using actually comes from. Using a namespace with an identifier, for example:

    include somefile.e as my_lib
    include another.e
    my_lib:foo()

enables Euphoria to resolve the identifier (##foo##) as explicitly coming from the file associated with the namespace "my_lib". This means that if ##foo## was also declared as global/public/export in //another.e// then that ##foo## would be ignored and the ##foo## in //somefile.e// would be used instead. Without that namespace, Euphoria would have complained (##Errors resolving the following references:##).

If you need to use both ##foo## symbols you can still do that by using two different namespaces. For example:

    include somefile.e as my_lib
    include another.e as her_ns
    my_lib:foo() -- Calls the one in somefile.e
    her_ns:foo() -- Calls the one in another.e

Note that there is a reserved namespace name that is always in use. The special namespace **##eu##** is used to let Euphoria know that you are accessing a built-in symbol rather than one of the same name declared in someone's file.

For example...

    include somefile.e as my_lib
    result = my_lib:find(something) -- Calls the 'find' in somefile.e
    xy = eu:find(X, Y) -- Calls Euphoria's built-in 'find'

The controlling variable used in a [[:for statement]] is special. It is automatically declared at the beginning of the loop block, and its scope ends at the end of the for-loop. If the loop is inside a function or procedure, the loop variable cannot have the same name as any other variable declared in the routine or enclosing block. When the loop is at the top level, outside of any routine, the loop variable cannot have the same name as any other file-scoped variable.
You can use the same name in many different for-loops as long as the loops are not nested. You do not declare loop variables as you would other variables because they are automatically declared as atoms. The range of values specified in the for statement defines the legal values of the loop variable. Variables declared inside other types of blocks, such as a **loop**, **while**, **if** or **switch** statement use the same scoping rules as a for-loop index. @[override|] ==== The override qualifier There are times when it is necessary to replace a global, public or export identifier. Typically, one would do this to extend the capabilities of a routine. Or perhaps to supersede the user defined type of some public, export or global variable, since the type itself may not be global. This can be achieved by declaring the identifier as **override**: override procedure puts(integer channel,sequence text) eu:puts(log_file, text) eu:puts(channel, text) end procedure A warning will be issued when you do this, because it can be very confusing, and would probably break code, for the new routine to change the behavior of the former routine. Code that was calling the former routine expects no difference in service, so there should not be any. If an identifier is declared global, public or export, but not override, and there is a built-in of the same name, Euphoria will not assume an override, and will choose the built-in. A warning will be generated whenever this happens. @[deprecate|] === Deprecation Beginning in Euphoria 4.1, procedures and functions can be marked as deprecated. Deprecation is a computer software term that assigns a status to a particular item to indicate that it should be avoided, typically because it has been superseded. Deprecated routines remain in the language or library but should be avoided. The ##deprecate## modifier will cause a warning to appear if that routine is used. 
It has no other effect, but it is a powerful way to keep an evolving library clean, slim and fit for the task.

Instead of simply removing an old routine, authors are encouraged to use the ##deprecate## modifier on the routine and leave it as part of the library for at least one major version increment. It can then be removed. This allows your users time to upgrade their code to the new recommended routine. Deprecated routines should be included in your manual, which should state when and why they were deprecated and what the future path is for accomplishing the same task.

    --**
    -- Say hello to someone
    --
    -- Parameters:
    --   * name - name of person to say hello to
    --
    -- Deprecated:
    --   ##say_hello## has been deprecated in favor of the new greet routine.
    --

    deprecate public procedure say_hello(sequence name)
        printf(1, "Hello, %s\n", { name })
    end procedure

    public procedure greet(sequence name="World", sequence greeting="Hello")
        printf(1, "%s, %s\n", { greeting, name })
    end procedure

When deprecating a routine, the keyword ##deprecate## should occur before any scope modifier.

%%output=lang_assignment
== Assignment statement
:<>

An **assignment statement** assigns the value of an expression to a simple variable, or to a subscript or slice of a variable. e.g.

    x = a + b

    y[i] = y[i] + 1

    y[i..j] = {1, 2, 3}

The previous value of the variable, or element(s) of the subscripted or sliced variable, are discarded. For example, suppose x was a 1000-element sequence that we had initialized with:

    object x

    x = repeat(0, 1000) -- a sequence of 1000 zeros

and then later we assigned an atom to x with:

    x = 7

This is perfectly legal since x is declared as an **object**. The previous value of x, namely the 1000-element sequence, would simply disappear. Actually, the space consumed by the 1000-element sequence will be automatically recycled due to Euphoria's dynamic storage allocation.

Note that the equals symbol '=' is used for both assignment and for equality testing.
There is never any confusion because an assignment in Euphoria is a statement only, it can't be used as an expression (as in C). === Assignment with Operator Euphoria also provides some additional forms of the assignment statement. To save typing, and to make your code a bit neater, you can combine assignment with one of the operators: + - / * & For example, instead of saying: mylongvarname = mylongvarname + 1 You can say: mylongvarname += 1 Instead of saying: galaxy[q_row][q_col][q_size] = galaxy[q_row][q_col][q_size] * 10 You can say: galaxy[q_row][q_col][q_size] *= 10 and instead of saying: accounts[start..finish] = accounts[start..finish] / 10 You can say: accounts[start..finish] /= 10 In general, whenever you have an assignment of the form: {{{ left-hand-side = left-hand-side op expression }}} You can say: {{{ left-hand-side op= expression }}} where **//op//** is one of: + - * / & When the left-hand-side contains multiple subscripts/slices, the ##op=## form will usually execute faster than the longer form. When you get used to it, you may find the ##op=## form to be slightly more readable than the long form, since you don't have to visually compare the left-hand-side against the copy of itself on the right side. You cannot use assignment with operators while declaring a variable, because that variable is not initialized when you perform the assignment. %%output=lang_branch == Branching Statements :<> @[then|] @[else|] @[elsif|] === if statement An **if statement** tests a condition to see whether it is true or false, and then depending on the result of that test, executes the appropriate set of statements. The syntax of ##if## is IFSTMT ==: IFTEST [ ELSIF ...] 
[ELSE] ENDIF IFTEST ==: if ATOMEXPR [ LABEL ] then [ STMTBLOCK ] ELSIF ==: elsif ATOMEXPR then [ STMTBLOCK ] ELSE ==: else [ STMTBLOCK ] ENDIF ==: end if **Description of syntax**\\ * An //if statement// consists of the keyword ##**if**##, followed by an //expression// that evaluates to an atom, optionally followed by a //label// clause, followed by the keyword ##**then**##. Next is a set of zero or more statements. This is followed by zero or more //elsif// clauses. Next is an optional //else// clause and finally there is the keyword ##**end**## followed by the keyword ##**if**##. * An //elsif// clause consists of the key word ##**elsif**##, followed by an //expression// that evaluates to an atom, followed by the keyword ##**then**##. Next is a set of zero or more statements. * An //else// clause consists of the keyword ##**else**## followed by a set of zero or more statements. In Euphoria, //false// is represented by an atom whose value is zero and //true// is represented by an atom that has any non-zero value. * When an //expression// being tested is true, Euphoria executes the statements immediately following the ##**then**## keyword after the //expression//, up to the corresponding ##**elsif**## or ##**else**##, whichever comes next, then skips down to the corresponding ##**end if**##. * When an //expression// is false, Euphoria skips over any statements until it comes to the next corresponding ##**elsif**## or ##**else**##, whichever comes next. If this is an ##**elsif**## then its //expression// is tested otherwise any statements following the ##**else**## are executed. For example: if a < b then x = 1 end if if a = 9 and find(0, s) then x = 4 y = 5 else z = 8 end if if char = 'a' then x = 1 elsif char = 'b' or char = 'B' then x = 2 elsif char = 'c' then x = 3 else x = -1 end if Notice that ##**elsif**## is a contraction of //else if//, but it's cleaner because it does not require an ##**end if**## to go with it. 
There is just one ##**end if**## for the entire //if statement//, even when there are many ##**elsif**## clauses contained in it.

The ##**if**## and ##**elsif**## expressions are tested using [[:short_circuit]] evaluation.

An //if statement// can have a //label clause// just before the first ##**then**## keyword. See the section on [[:Header Labels]]. Note that an //elsif clause// cannot have a label.

@[case|]
@[do|]
=== switch statement ===

The switch statement is used to run a specific set of statements, depending on the value of an expression. It often replaces a set of if-elsif statements because it can be highly optimized, giving much better performance. There are some key differences, however. A switch statement operates upon the value of a single expression, and the program flow continues based upon defined cases. The syntax of a switch statement:

    switch <expr> [with fallthru] [label "<label name>"] do
        case <val>[, <val>, ...] then
            [code block]
            [[break [label]]|fallthru]
        case <val>[, <val>, ...] then
            [code block]
            [[break [label]]|fallthru]
        case <val>[, <val>, ...] then
            [code block]
            [[break [label]]|fallthru]
        ...
        [case else]
            [code block]
            [[break [label]]|fallthru]
    end switch

The above example could be written with ##if## statements like this ...

    object temp = expression
    object breaking = false
    if equal(temp, val1) then
        [code block 1]
        [breaking = true]
    end if
    if not breaking and equal(temp, val2) then
        [code block 2]
        [breaking = true]
    end if
    if not breaking and equal(temp, val3) then
        [code block 3]
        [breaking = true]
    end if
    ...
    if not breaking then
        [code block 4]
        [breaking = true]
    end if

The value in a ##case## must be either an atom, literal string, constant or enum. Multiple values for a single ##case## can be specified by separating the values by commas. The same symbol (or literal) may not be used multiple times as a ##case## for the same ##switch##. If two different symbols used as ##case## values happen to have the same value, they must be in the same ##case...then## statement, or an error will occur.
If the parser can determine all values when the ##switch## is parsed, then a compile time error will be thrown. Otherwise, the error will occur the first time that the switch is encountered. Likewise, when translating code, if the parser cannot determine all values at the time when the ##case## values are parsed, the compilation will fail due to duplicate ##case## values in the emitted C code (it is assumed that the programmer should work out this sort of bug in interpreted mode).

By default, control flows to the end of the ##switch## block when the next ##case## is encountered. The default behavior can be modified in two ways. The default for a particular ##switch## block can be changed so that control passes to the next executable statement whenever a new case is encountered by using ##with fallthru## in the ##switch## statement:

    switch x with fallthru do
        case 1 then
            ? 1
        case 2 then
            ? 2
            break
        case else
            ? 0
    end switch

Note that when ##with fallthru## is used, the ##break## statement can be used to jump out of the ##switch## block. The behavior of individual ##case##s can be changed by using the ##fallthru## statement:

    switch x do
        case 1 then
            ? 1
            fallthru
        case 2 then
            ? 2
        case else
            ? 0
    end switch

Note that the ##break## statement before ##case else## was omitted, because the equivalent action is taken automatically by default.

    switch length(x) do
        case 1 then
            -- do something
            fallthru
        case 2 then
            -- do something extra
        case 3 then
            -- do something usual
        case else
            -- do something else
    end switch

The ##label "name"## is optional and, if used, gives a name to the switch block. This name can be used in nested switch ##break## statements to break out of an enclosing switch rather than just the owning switch.
\\
Example:

    switch opt label "LBLa" do
        case 1, 5, 8 then
            FuncA()
        case 4, 2, 7 then
            FuncB()
            switch alt label "LBLb" do
                case "X" then
                    FuncC()
                    break "LBLa"
                case "Y" then
                    FuncD()
                case else
                    FuncE()
            end switch
            FuncF()
        case 3 then
            FuncG()
            break
        case else
            FuncH()
    end switch
    FuncM()

In the above, if opt is 2 and alt is "X" then it runs...\\
:: FuncB() FuncC() FuncM()

But if opt is 2 and alt is "Y" then it runs ...\\
:: FuncB() FuncD() FuncF() FuncM()

In other words, the ##break "LBLa"## skips to the end of the switch called "LBLa" rather than the switch called "LBLb".

@[elsedef|]
@[elsifdef|]
=== ifdef statement

The ##ifdef## statement has a similar syntax to the ##if## statement.

    ifdef SOME_WORD then
        --... zero or more statements
    elsifdef SOME_OTHER_WORD then
        --... zero or more statements
    elsedef
        --... zero or more statements
    end ifdef

Of course, the ##elsifdef## and ##elsedef## clauses are optional, just like ##elsif## and ##else## are optional in an ##if## statement.

The major differences between an ##if## and an ##ifdef## statement are that ##ifdef## is executed at parse time, not runtime, and ##ifdef## can only test for the existence of a defined word whereas ##if## can test any boolean expression.

**Note** that since the ##ifdef## statement executes at parse time, run-time values cannot be checked, only words defined by the ##-D## command line switch, or by the ##with define## directive, or one of the special predefined words.

The purpose of ##ifdef## is to allow you to change the way your program operates in a very efficient manner. Rather than testing for a specific condition repeatedly during the running of a program, ##ifdef## tests for it once during parsing and then generates the precise IL code to handle the condition.

For example, assume you have some debugging code in your application that displays information to the screen. Normally you would not want to see this display so you set a condition so it only displays during a 'debug' session.
The first example below shows how you could do this using just the ##if## statement, and the second example shows the same thing using the ##ifdef## statement.

    -- Example 1. --
    if find("-DEBUG", command_line()) then
        writefln("Debug x=[], y=[]", {x,y})
    end if

    -- Example 2. --
    ifdef DEBUG then
        writefln("Debug x=[], y=[]", {x,y})
    end ifdef

As you can see, they are almost identical. However, in the first example, every time the program gets to this point in the code, it tests the command line for the -DEBUG switch before deciding whether to display the information. In the second example, the existence of DEBUG is tested //once// at parse time, and if it exists then Euphoria generates the IL code to do the display. Thus, every time the running program gets to this point in the code, it does **not** check that DEBUG exists; it already knows that it does, so it just does the display. If, however, DEBUG did not exist at parse time, then the IL code for the display is simply omitted. When the running program gets to this point in the code, it does not recheck for DEBUG; the IL code to do the display does not exist, so nothing is displayed. This can be a much needed performance boost for a program.

Euphoria predefines some words itself:

==== Euphoria Version Definitions

* **EU4** - Major Euphoria Version
* **EU4_2** - Major and Minor Euphoria Version
* **EU4_2_0** - Major, Minor and Release Euphoria Version

Euphoria is released with the common version scheme of Major, Minor and Release version identifiers in the form of major.minor.release. When 4.2.1 is released, ##EU4_2_1## will be defined and ##EU4## will still be defined, but ##EU4_2_0## will no longer be defined. When 4.3 is released, ##EU4_2## will no longer be defined, but ##EU4_3## will be defined. Finally, when 5.0 is released, ##EU4## will no longer be defined, but ##EU5## will be defined.
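The version words can be combined with ##elsifdef## to select code for a particular release line. The fragment below is only a sketch: it assumes the ##EU4_2## and ##EU4## words described above, and the messages are invented for illustration.

{{{
-- Test the most specific version word first, then fall back.
ifdef EU4_2 then
    puts(1, "Built for the Euphoria 4.2 series\n")
elsifdef EU4 then
    puts(1, "Built for another Euphoria 4 release\n")
elsedef
    puts(1, "Built for a different major version\n")
end ifdef
}}}

Testing the most specific word first matters, because ##EU4## is also defined whenever ##EU4_2## is.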
==== Platform Definitions

* **CONSOLE** - Euphoria is being executed with the Console version of the interpreter (on Windows, eui.exe; on other platforms, eui)
* **GUI** - Platform is Windows and is being executed with the GUI version of the interpreter (euiw.exe)
* **WINDOWS** - Platform is Windows (GUI or Console)
* **LINUX** - Platform is Linux
* **OSX** - Platform is Mac OS X
* **FREEBSD** - Platform is FreeBSD
* **OPENBSD** - Platform is OpenBSD
* **NETBSD** - Platform is NetBSD
* **BSD** - Platform is a BSD variant (FreeBSD, OpenBSD, NetBSD and OS X)
* **UNIX** - Platform is any Unix

==== Architecture Definitions

Chip architecture:
* **X86**
* **X86_64**
* **ARM**

Size of pointers and Euphoria objects. This information can be derived from the chip architecture, but is provided for convenience.
* **BITS32**
* **BITS64**

Size of long integers. On Windows, long integers are always 32 bits. On other platforms, long integers are the same size as pointers. This information can also be derived from a combination of other architecture and platform ifdefs, but is provided for convenience.
* **LONG32**
* **LONG64**

==== Application Definitions

* **EUI** - Application is being interpreted by ##eui##.
* **EUC** - Application is being translated by ##euc##.
* **EUC_DLL** - Application is being translated by ##euc## into a //DLL// file.
* **EUB** - Application is being converted to a bound program by ##eub##.
* **EUB_SHROUD** - Application is being converted to a shrouded program by ##eub##.
* **CONSOLE** - Application is being translated, or converted to a bound //console// program by ##euc## or ##eub##, respectively.
* **GUI** - Application is being converted to a bound //Windows GUI// program by ##eub##.

==== Library Definitions

* **DATA_EXECUTE** - Application will always get executable memory from ##allocate## even when the system has Data Execute Protection enabled for the Euphoria Interpreter.
* **SAFE** - Enables safe runtime checks for operations for routines found in ##machine.e## and ##dll.e##
* **UCSTYPE_DEBUG** - Found in ##include/std/ucstypes.e##
* **CRASH** - Found in ##include/std/unittest.e##

More examples

    -- file: myproj.ex
    puts(1, "Hello, I am ")
    ifdef EUC then
        puts(1, "a translated")
    end ifdef
    ifdef EUI then
        puts(1, "an interpreted")
    end ifdef
    ifdef EUB then
        puts(1, "a bound")
    end ifdef
    ifdef EUB_SHROUD then
        puts(1, ", shrouded")
    end ifdef
    puts(1, " program.\n")

{{{
C:\myproj> eui myproj.ex
Hello, I am an interpreted program.
C:\myproj> euc -con myproj.ex
... translating ...
... compiling ...
C:\myproj> myproj.exe
Hello, I am a translated program.
C:\myproj> bind myproj.ex
...
C:\myproj> myproj.exe
Hello, I am a bound program.
C:\myproj> shroud myproj.ex
...
C:\myproj> eub myproj.il
Hello, I am a bound, shrouded program.
}}}

It is possible for one or more of the above definitions to be true at the same time. For instance, ##EUC## and ##EUC_DLL## will both be true when the source file has been translated to a DLL. If you wish to know whether your source file is translated and not a DLL, then you can write:

    ifdef EUC and not EUC_DLL then
        -- translated to an application
    end ifdef

==== Using ifdef

You can define your own words either in source:

    with define MY_WORD -- defines
    without define OTHER_WORD -- undefines

or by command line:

{{{
eui -D MY_WORD myprog.ex
}}}

This can handle many tasks, such as changing the behavior of your application when running on //Linux// vs. //Windows//, enabling or disabling debug-style code, or working differently in demo/shareware applications vs. registered applications.

You should surround code that is not portable with ##ifdef## like:

    ifdef WINDOWS then
        -- Windows specific code.
    elsedef
        include std/error.e
        crash("This program must be run with the Windows interpreter.")
    end ifdef

When writing **include files** that you cannot run on some platform, issue a crash call in the **include file**.
**Yet** make sure that public constants and procedures are defined for the unsupported platform as well. ifdef UNIX then include std/bash.e end ifdef -- define exported and public constants and procedures for -- OSX as well ifdef WINDOWS or OSX then -- OSX is not supported but we define public symbols for it anyhow. The reason for doing this is so that the user that includes your include file sees an "OS not supported" message instead of an "undefined reference" message. Defined words must follow the same character set of an identifier, that is, it must start with either a letter or underscore and contain any mixture of letters, numbers and underscores. It is common for defined words to be in all upper case, however, it is not required. A few examples: for a = 1 to length(lines) do ifdef DEBUG then printf(1, "Line %i is %i characters long\n", {a, length(lines[a])}) end ifdef end for sequence os_name ifdef UNIX then include unix_ftp.e elsifdef WINDOWS then include win32_ftp.e elsedef crash("Operating system is not supported") end ifdef ifdef SHAREWARE then if record_count > 100 then message("Shareware version can only contain 100 records. Please register") abort(1) end if end ifdef The ##ifdef## statement is very efficient in that it makes the decision only once during parse time and only emits the ##TRUE## portions of code to the resulting interpreter. Thus, in loops that are iterated many times there is zero performance hit when making the decision. 
Example: while 1 do ifdef DEBUG then puts(1, "Hello, I am a debug message\n") end ifdef -- more code end while If ##DEBUG## is defined, then the interpreter/translator actually sees the code as being: while 1 do puts(1, "Hello, I am a debug message\n") -- more code end while Now, if ##DEBUG## is not defined, then the code the interpreter/translator sees is: while 1 do -- more code end while Do be careful to spell the platform names exactly as they are predefined; a word that is not defined is silently treated as false: -- This puts() routine will never be called -- even when run by the Windows interpreter! ifdef WIN then puts(1,"I am on Windows\n") end ifdef %%output=lang_loop == Loop statements :<> An iterative code block repeats its own execution zero, one or more times. There are several ways to specify for how long the process should go on, and how to stop or otherwise alter it. An iterative block may be informally called a loop, and each execution of code in a loop is called an iteration of the loop. Euphoria has three flavors of loops. They may all carry a label (see [[:Header Labels]]), in order to make exiting or resuming them more flexible. === while statement A **while statement** tests a condition to see if it is non-zero (true), and if so, a body of statements is executed. The condition is re-tested after the statements are run, and if still true the statements are run again, and so on. Syntax Format: >##**while** //expr// //[//**with entry**//]// //[//**label** //"name"// //]// **do**## >>##//statements//## >##//[//**entry**//]//## >>##//statements//## >##**end while**## Example 1 while x > 0 do a = a * 2 x = x - 1 end while Example 2 while sequence(Line) with entry do proc(Line) entry Line = gets(handle) end while Example 3 while true label "main" do res = funcA() if res > 5 then if funcB() > some_value then continue "main" -- go to start of loop end if procC() end if procD(res) for i = 1 to res do if i > some_value then exit "main" -- exit the "main" loop, not just this 'for' loop.
end if procF(i,res) end for res = funcE(res, some_value) end while === loop until statement A **loop** statement executes its body and then tests a condition; the body is repeated until the condition is non-zero (true). Syntax Format: >##**loop** //[//**with entry**//]// //[//**label** //"name"// //]// **do**## >>##//statements//## >>##**until** //expr//## >##end loop## loop do a = a * 2 x = x - 1 until x<=0 end loop loop with entry do a = a * 2 entry x = x - 1 until x<=0 end loop loop label "GONEXT" do a = a * 2 y += 1 if y = 7 then continue "GONEXT" end if x = x - 1 until x<=0 end loop A ##while## statement differs from a ##loop## statement because the body of a ##loop## is executed at least once, since testing takes place **after** the body completes. In a ##while## statement, however, the test is made **before** the body is executed. @[to|] @[by|] === for statement Syntax Format: >##**for** **loopvar** = **startexpr** to **endexpr** //[//**by delta**//]// **do**## >>##//statements//## >##end for## A **for** statement sets up a special loop that has its own **loop variable**. The **loop variable** starts with the specified initial value and is incremented or decremented until it passes the specified final value. The **for** statement is used when you need to repeat a set of statements a specific number of times.\\ Example: -- Display the numbers 1 to 6 on the screen. puts(1, "1\n") puts(1, "2\n") puts(1, "3\n") puts(1, "4\n") puts(1, "5\n") puts(1, "6\n") This block of code simply starts at the first line and runs each in turn. But it could be written more simply and flexibly by using a **for** statement. for i = 1 to 6 do printf(1, "%d\n", i) end for Now it's just three lines of code rather than six. More importantly, if we needed to change the program to print the numbers from 1 to 100, we only have to change one line rather than add 94 new lines. for i = 1 to 100 do -- One line change. printf(1, "%d\n", i) end for Or using another way ... for i = 1 to 10 do ? i -- ?
is a short form for print() end for -- fractional numbers allowed too for i = 10.0 to 20.5 by 0.3 do for j = 20 to 10 by -2 do -- counting down ? {i, j} end for end for However, adding together floating point numbers that are not the ratio of an integer to a power of 2 ~--// 0.3 is not such a ratio//~--leads to some "fuzz" in the value of the index. In some cases, you might get unexpected results because of this fuzz, which arises from a common hardware limitation. For instance, ##floor(10*0.1)## is ##1## as expected, but ##floor(0.1+0.1+0.1+0.1+0.1+0.1+0.1+0.1+0.1+0.1)## is ##0##. The **loop variable** is declared automatically and exists until the end of the loop. Outside of the loop the variable has no value and is not even declared. If you need its final value, copy it into another variable before leaving the loop. The compiler will not allow any assignments to a loop variable. The initial value, loop limit and increment must all be atoms. If no increment is specified then +1 is assumed. The limit and increment values are established only on entering the loop, and are not affected by anything that happens during the execution of the loop. %%output=lang_flow == Flow control statements :<> Program execution flow refers to the order in which program statements are run. By default, the next statement to run after the current one is the next statement //physically// located after the current one.\\ Example: a = b + c printf(1, "The result of adding %d and %d is %d", {b,c,a}) In that example, ##b## is added to ##c##, assigning the result to ##a##, and then the information is displayed on the screen using the ##printf## statement. However, there are many times in which the order of execution needs to be different from the default order, to get the job done. Euphoria has a number of //flow control statements// that you can use to arrange the execution order of statements. A set of statements that are run in their order of appearance is called a //block//.
Blocks are good ways to organize code in easily identifiable chunks. However it can be desirable to leave a block before reaching the end, or slightly alter the default course of execution.\\ The following flow control keywords are available. break retry entry exit continue return goto end === exit statement Exiting a loop is done with the keyword **exit**. This causes flow to immediately leave the current loop and recommence with the first statement after the end of the loop. for i = a to b do c = i if doSomething(i) = 0 then exit -- Stop executing code inside the 'for' block. end if end for -- Flow restarts here. if c = a then ... But sometimes you need to leave a block that encloses the current one. Euphoria has two ways available for you to do this. The safest way, in terms of future maintenance, is to name the block you want to exit from and use that name on the exit statement. The other way is to use a number on the exit statement that refers to the depth that you want to exit from. A block's name is always a string literal and only a string literal. You cannot use a variable that contains the block's name on an exit statement. The name comes after the ##label## keyword, just before the ##do## keyword.\\ Example: integer b b = 0 for i = 1 to 20 label "main" do for j = 1 to 20 do b += i + j ? {i, j, b} if b > 50 then b = 0 exit "main" end if end for end for ? b The output from this is ... {1, 1, 2} {1, 2, 5} {1, 3, 9} {1, 4, 14} {1, 5, 20} {1, 6, 27} {1, 7, 35} {1, 8, 44} {1, 9, 54} 0 The **exit "main"** causes execution flow to leave the **for** block named //main//. The same thing could be achieved using the **exit N** format... integer b b = 0 for i = 1 to 20 do for j = 1 to 20 do b += i + j ? {i, j, b} if b > 50 then b = 0 exit 2 -- exit 2 levels of depth end if end for end for ? b But using this way means you have to take more care when changing the program so that if you change the depth, you also need to change the //exit// statement. 
;Note~: :A special form of **exit N** is ##exit 0##. This leaves all levels of loop, regardless of the depth. Control continues after the outermost loop block. Likewise, ##exit -1## exits the second outermost loop, and so on. For easier and safer program maintenance, the explicit label form is to be preferred. Other forms are variously sensitive to changes in the program organization. Yet, they may prove more convenient in short, short lived programs, and are provided mostly for this purpose. For information on how to associate a string to a block of code, see the section [[:Header Labels]]. An **exit** without any label or number in a [[:while statement]] or a [[:for statement]] causes immediate termination of that loop, with control passing to the first statement after the loop.\\ Example: for i = 1 to 100 do if a[i] = x then location = i exit end if end for It is also quite common to see something like this: constant TRUE = 1 while TRUE do ... if some_condition then exit end if ... end while i.e. an "infinite" while-loop that actually terminates via an **exit statement** at some arbitrary point in the body of the loop. ;**Performance Note~:** :Euphoria optimizes this type of loop. At run-time, no test is performed at the top of the loop. There's just a simple unconditional jump from **end while** back to the first statement inside the loop. === break statement Works exactly like the **exit statement**, but applies to **if statements** or **switch statements** rather than to loop statements of any kind. Example: if s[1] = 'E' then a = 3 if s[2] = 'u' then b = 1 if s[3] = 'p' then break 0 -- leave topmost if block end if a = 2 else b = 4 end if else a = 0 b = 0 end if This code results in: * "Dur" -> a=0 b=0 * "Exe" -> a=3 b=4 * "Eux" -> a=2 b=1 * "Eup" -> a=3 b=1 The same optional parameters can be used with the **break** statement as with the **exit** statement, but of course apply to if and switch blocks only, instead of loops. 
=== continue statement Likewise, skipping the rest of an iteration in a single code block is done using a single keyword, **continue**. The **continue statement** continues execution of the loop it applies to by going to the next iteration now. Going to the next iteration means testing a condition (for **while** and **loop** constructs), or changing the **for** construct variable index and checking whether it is still within bounds. for i = 3 to 6 do ? i if i = 4 then puts(1,"(2)\n") continue end if ? i * i end for This will print 3, 9, 4, (2), 5, 25, 6, 36. integer b b = 0 for i = 1 to 20 label "main" do for j = 1 to 20 do b += i + j if b > 50 then printf(1, "%d ", b) b = 0 continue "main" end if end for end for ? b The same optional parameters that can be used in an **exit** statement can apply to a **continue** statement. === retry statement The **retry statement** retries executing the current iteration of the loop it applies to. The statement branches to the first statement of the designated loop, without testing anything or incrementing the for loop index. Normally, a sub-block which contains a **retry statement** also contains another flow control keyword, since otherwise the iteration would be endlessly executed. errors = 0 for i = 1 to length(files_to_open) do fh = open(files_to_open[i], "rb") if fh=-1 then if errors > 5 then exit else errors += 1 retry end if end if file_handles[i] = fh end for Since **retry** does not change the value of i and tries again opening the same file, there has to be a way to break from the loop, which the **exit statement** provides. The same optional parameters that can be used in an **exit** statement can apply to a **retry** statement. @[entry|] === with entry statement It is often the case that the first iteration of a loop is somehow special. Some things have to be done before the loop starts~--they are done before the statement starting the loop.
Now, the problem is that, just as often, some things do not need to, or should not, be done at this initialization stage. The **entry keyword** is an alternative to setting flags relentlessly and forgetting to update them. Just add the **entry** keyword at the point where you wish the first iteration to start. public function find_all(object x, sequence source, integer from) sequence ret = {} while from > 0 with entry do ret &= from from += 1 entry from = find_from(x, source, from) end while return ret end function Instead of performing an initial test, which may crash because from has not been assigned a value yet, the first iteration jumps to the point where from is being computed. The following iterations are normal. To emphasize the fact that the first iteration is not normal, the entry clause must be added to the loop header, after the condition. The entry statement is not supported for ##for## loops, because they have a more rigid structure than while or loop constructs. ; Note on infinite loops. : With **eui.exe** or **eui**, control-c will always stop your program immediately, but with the ##euiw.exe## that has not produced any console output, you will have to use the //Windows// process monitor to end the application. === goto statement ##goto## instructs the computer to resume code execution at a place which does not follow the statement. The place to resume execution is called the //target// of the statement. It is restricted to lie in the current routine, or the current file if outside any routine. Syntax is: goto "label string" The target of a ##goto## statement can be any accessible ##label## statement: label "label string" Label names must be double quoted constant strings. Characters that would be illegal in an Euphoria identifier may appear in a label name, since it is a regular string. [[:Header Labels]] do not count as possible goto targets.
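As a minimal sketch of the ##goto## and ##label## syntax just described, the routine below jumps forward out of a ##for## loop to a label placed later in the same routine. The routine name ##first_negative##, the variable names and the label name ##"report"## are hypothetical, chosen only for this illustration, and the sequence passed in is assumed to contain atoms only.

```euphoria
-- A sketch only: first_negative() and the label "report" are
-- hypothetical names, not part of any Euphoria library.
function first_negative(sequence s)
    integer found = 0
    for i = 1 to length(s) do
        if s[i] < 0 then
            found = i
            goto "report"   -- jump forward, out of the for loop
        end if
    end for
    puts(1, "no negative value found\n")
    return 0

    label "report"          -- target of the goto above
    printf(1, "first negative value is at index %d\n", found)
    return found
end function

integer pos = first_negative({3, 8, -2, 5})
```

Both the ##goto## and its target lie in the same routine, as the restriction above requires. The same effect could be had with ##exit## and a flag test after the loop; the sketch only shows where the two statements go.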
Use ##goto## in production code when all of the following apply: * you want to proceed with a statement which is not the following one; * the various structured constructs wouldn't do, or very awkwardly; * you contemplate a significant gain in speed/reliability from such a direct move; * the code flow remains understandable for an outsider nevertheless. During early development, ##goto## may be convenient while the code is not firmly structured. But most instances of ##goto## should melt into structured constructs as soon as possible as code matures. You may find out that modifying a program that has goto statements is usually trickier than if it had not had them. The following may be situations where ##goto## can help: * A routine has several return statements, and some processing must be done before returning, no matter from where. It may be clearer to goto a single return point and perform the processing only at this point. * An exit statement in a loop corresponds to an early exit, and the normal processing that immediately follows the loop is not relevant. Replacing an exit statement followed by various flag testing by a single goto can help. Explicit label names will tremendously help maintenance. Remember that there is no limit to their contents. goto-ing into a scope (like an if block, a for loop,...) will just do that. Some variables may be defined only in that scope, and they may or may not have sensible values. It is up to the programmer to take appropriate action in this respect. === Header Labels === As shown in the above section on control flow statements, most can have their own label. To label a flow control statement, use a ##label## clause immediately preceding the flow control's terminator keyword (##then## / ##do##). A ##label## clause consists of the keyword **##label##** followed by a string literal. The string is the label name. Examples: if n=0 label "an_if_block" then ... end if while TRUE label "a_while_block" do ...
end while loop label "a_loop_block" do ... until TRUE end loop switch x label "a_switch_block" do ... end switch **Note**: If a flow control statement has both an ##entry## clause and a ##label## clause, the ##entry## clause must come before the ##label## clause: while 1 label "top" with entry do -- WRONG while 1 with entry label "top" do -- CORRECT %%output=lang_short_circuit == Short-Circuit Evaluation == @[short_circuit|] :<> When the condition tested by if, elsif, until, or while contains ##and## or ##or## operators, [[:short_circuit]] evaluation will be used. For example, if a < 0 and b > 0 then ... If a < 0 is false, then Euphoria will not bother to test if b is greater than 0. It will know that the overall result is false regardless. Similarly, if a < 0 or b > 0 then ... if a < 0 is true, then Euphoria will immediately decide that the result is true, without testing the value of b, since the result of this test would be irrelevant. In general, whenever we have a condition of the form: A and B where A and B can be any two expressions, Euphoria will take a short-cut when A is false and immediately make the overall result false, without even looking at expression B. Similarly, with: A or B when A is true, Euphoria will skip the evaluation of expression B, and declare the result to be true. If the expression B contains a call to a function, and that function has possible **side-effects**, i.e. it might do more than just return a value, you will get a compile-time warning. Older versions (pre-2.1) of Euphoria did not use [[:short_circuit]] evaluation, and it's possible that some old code will no longer work correctly, although a search of the Euphoria archives did not turn up any programs that depend on side-effects in this way, but other Euphoria code might do so. The expression, B, could contain something that would normally cause a run-time error. If Euphoria skips the evaluation of B, the error will not be discovered. 
For instance: if x != 0 and 1/x > 10 then -- divide by zero error avoided while 1 or {1,2,3,4,5} do -- illegal sequence result avoided B could even contain uninitialized variables, out-of-bounds subscripts etc. This may look like sloppy coding, but in fact it often allows you to write something in a simpler and more readable way. For instance: if length(x) > 1 and x[2] = y then Without short-circuiting, you would have a problem when x contains fewer than 2 items. With short-circuiting, the test of x[2] will only be done when x has at least 2 items. Similarly: -- find 'a' or 'A' in s i = 1 while i <= length(s) and s[i] != 'a' and s[i] != 'A' do i += 1 end while In this loop the variable i might eventually become greater than length(s). Without short-circuit evaluation, a subscript out-of-bounds error will occur when s[i] is evaluated on the final iteration. With short-circuiting, the loop will terminate immediately when i <= length(s) becomes false. Euphoria will not evaluate s[i] != 'a' and will not evaluate s[i] != 'A'. No subscript error will occur. **Short-circuit** evaluation of ##and## and ##or## takes place inside decision making expressions. These are found in the [[:if statement]], [[:while statement]] and the [[:loop until statement]]. It is not used in other contexts. For example, the assignment statement: x = 1 or {1,2,3,4,5} -- x should be set to {1,1,1,1,1} If short-circuiting were used here, we would set x to 1, and not even look at {1,2,3,4,5}. This would be wrong. Short-circuiting can be used in if/elsif/until/while conditions because we only care if the result is true or false, and conditions are required to produce an atom as a result. %%output=lang_toplevel == Special Top-Level Statements == @[specialstatements|] :<> Before any of your statements are executed, the Euphoria front-end quickly reads your entire program. All statements are syntax checked and converted to a low-level intermediate language (IL).
The interpreter immediately executes the IL after it is completely generated. The translator converts the IL to C. The binder/shrouder saves the IL on disk for later execution. These three tools all share the same front-end (written in Euphoria). If your program contains only routine and variable declarations, but no top-level executable statements, then nothing will happen when you run it (other than syntax checking). You need a top-level statement to call your main routine (see [[:Example Programs]]). It's quite possible to have a program with nothing but top-level executable statements and no routines. For example you might want to use Euphoria as a simple calculator, typing just a few [[:print]] or [[:? -> q_print]] statements into a file, and then executing it. As we have seen, you can use any Euphoria statement, including [[:for statement]], [[:while statement]], [[:if statement]], etc... (but not [[:return statement|return]]), at the top level i.e. //outside// of any [[:function ->functions]] or [[:procedure ->procedures]]. In addition, the following special statements may //only// appear at the top level: * ##include## * ##with## / ##without## === include statement When you write a large program it is often helpful to break it up into logically separate files, by using **include statements**. Sometimes you will want to reuse some code that you have previously written, or that someone else has written. Rather than copy this code into your main program, you can use an **include statement** to refer to the file containing the code. The first form of the include statement is: ; ##include //filename//## : This reads in (compiles) a Euphoria source file. Some Examples: include std/graphics.e include /mylib/myroutines.e public include library.e Any top-level code in the included file will be executed at start up time. Any ##global## identifiers that are declared in the file doing the including will also be visible in the file being included. 
However, the situation is slightly different for an identifier declared as **public** or **export**. In these cases the file being included will **not** see ##public/export## symbols declared in the file doing the including, unless the file being included also explicitly includes the file doing the including. Yes, you had better read that again because it's not that obvious. Here's an example... We have two files, a.e and b.e ... -- a.e -- ? c -- declared as global in 'b.e' -- b.e -- include a.e global integer c = 0 This will work because, being ##global##, the symbol 'c' in b.e can be seen by all files in this //include tree//. However ... -- a.e -- ? c -- declared as public in 'b.e' -- b.e -- include a.e public integer c = 0 This will not work, as public symbols can only be seen when their declaring file is explicitly included. So to get this to work you need to write a.e as ... -- a.e -- include b.e ? c -- declared as public in 'b.e' ---- **N.B.** Only those symbols declared as ##global## in the included file will be visible (accessible) in the remainder of the including file. Their visibility in other included files or in the main program file depends on other factors. Specifically, global symbols can only be accessed by files in the same //include tree//. For example... If we have danny.e declare a global symbol called 'foo', and bob.e includes danny.e, then code in bob.e can access danny's 'foo'. Now if we also have cathy.e declare a global symbol called 'foo', and anne.e includes cathy.e, then code in anne.e can access cathy's 'foo'. Nothing unusual about that situation. Now, if we have a program that includes both bob.e and anne.e, the code in bob.e and anne.e should still work even though there are now two global 'foo' symbols available. This is because the include tree for bob.e //only// contains danny.e and likewise the include tree for anne.e //only// contains cathy.e.
So, as the two 'foo' symbols are in separate include trees (from the perspective of bob.e and anne.e), code in those files continues to work correctly. A problem can occur if the main program (the one that includes both bob.e and anne.e) references 'foo'. In order for Euphoria to know which one the code author meant to use, the coder must use the namespace facility. --- mainprog.ex --- include anne.e as anne include bob.e as bob anne:foo() -- Specify the 'foo' from anne.e. If the above code did not use namespaces, Euphoria would not have known which 'foo' to use ~-- the one from bob.e or the one in anne.e. If public precedes the include statement, then all public identifiers from the included file will also be visible to the including file, and visible to any file that includes the current file. If an absolute //filename// is given, Euphoria will open it and start parsing it. When a relative //filename// is given, Euphoria will try to open the file relative to the following directories, in the following order: # The directory containing the current source file. i.e. the source file that contains the include statement that is being processed. # The directory containing the main file given on the interpreter, translator or binder command line ~-- see [[:command_line]]. # If you've defined an environment variable named ##EUINC##, Euphoria will check each directory listed in ##EUINC## (from left to right). ##EUINC## should be a list of directories, separated by semicolons (colons on //Linux// / //FreeBSD//), similar in form to your ##PATH## variable. ##EUINC## can be added to your set of //Linux// / //FreeBSD// or //Windows// environment variables. (Via ##Control Panel / Performance & Maintenance / System / Advanced## on //XP//, or ##AUTOEXEC.BAT## on older versions of //Windows//). e.g. ##SET EUINC=C:\EU\MYFILES;C:\EU\WINDOWSLIB## ##EUINC## lets you organize your include files according to application areas, and avoid adding numerous unrelated files to ##euphoria\include##.
# Finally, if it still hasn't found the file, it will look in ##euphoria\include##. This directory contains the standard Euphoria include files. The environment variable ##EUDIR## tells Euphoria where to find your ##euphoria## directory. An included file can include other files. In fact, you can "nest" included files up to 30 levels deep. Include file names typically end in ##.e##, or sometimes ##.ew## or ##.eu## (when they are intended for use with //Windows// or //Unix//). This is just a convention. It is not required. If your filename (or path) contains blanks or escapable characters, you must enclose it in double-quotes; otherwise quotes are optional. When a filename is enclosed in double-quotes, you can also use the standard escape character notation to specify filenames that have non-ASCII characters in them. Note that under Windows, you can also use the forward slash '/' instead of the usual back-slash '\'. By doing this, the file paths are compatible with //Unix// systems and it means you don't have to 'escape' the back-slashes. \\ For example: include "c:/program files/myfile.e" Other than possibly defining a new namespace identifier (see below), an include statement will be quietly ignored if the same file has already been included. An include statement must be written on a line by itself. Only a comment can appear after it on the same line. @[as|] The second form of the include statement is: ; ##include //filename// as //namespace_identifier//## : This is just like the simple include, but it also defines a //namespace identifier// that can be attached to global identifiers in the included file that you want to refer to in the main file. This might be necessary to disambiguate references to those identifiers, or you might feel that it makes your code more readable. This ##as identifier## namespace exists in the current file, along with any ##namespace identifier## the included file may define. > See Also: [[:Using namespaces]].
< === with / without These special statements affect the way that Euphoria translates your program into internal form. Options to the ##with## and ##without## statement come in two flavors. One simply turns an option on or off, while the others have multiple states. ==== On / Off options || Default || Option || | without | [[:Profiling "profile"]] | | without | [[:Profiling "profile_time"]] | | without | [[:trace]] | | without | [[:with_batch "batch"]] | | with | [[:type_check]] | | with | [[:indirect_includes]] | | with | [[:with_inline "inline"]] | ##with## turns **on** one of the options and ##without## turns **off** one of the options. For more information on the ##profile##, ##profile_time## and ##trace## options, see [[:Debugging and Profiling]]. For more information on the ##type_check## option, see [[:Performance Tips]]. There is also a rarely-used special ##with## option where a code number appears after ##with##. In previous releases this code was used by RDS to make a file exempt from adding to the statement count in the old "Public Domain" Edition. This is not used any longer, but does not cause an error. You can select any combination of settings, and you can change the settings, but the changes must occur //between// subroutines, not within a subroutine. The only exception is that you can only turn on one type of profiling for a given run of your program. An **included file** inherits the **with/without** settings in effect at the point where it is included. An included file can change these settings, but they will revert back to their original state at the end of the included file. For instance, an included file might turn off warnings for itself and (initially) for any files that it includes, but this will not turn off warnings for the main file. **@[indirect_includes]**, This ##with/without## option changes the way in which global symbols are resolved. Normally, the parser uses the way that files were included to resolve a usage of a global symbol. 
If ##without indirect_includes## is in effect, then only direct includes are considered when resolving global symbols. This option is especially useful when a program uses some code that was developed for a prior version of Euphoria that uses the pre-4.0 standard library, when all exposed symbols were global. These can often clash with symbols in the new standard library. Using ##without indirect_includes## would not force a coder to use namespaces to resolve symbols that clashed with the new standard library. Note that this setting does not propagate down to included files, unlike most ##with/without options##. Each file begins with ##indirect_includes## turned on. **@[with_batch|with batch]**, Causes the program to not present the "Press Enter" prompt if an error occurs. The exit code will still be set to 1 on error. This is helpful for programs that run in a mode where no human may be directly interacting with it. For example, a CGI application or a CRON job. You can also set this option via a [[:batch_command_line "command line parameter"]]. ==== Complex with / without options ===== with / without warning Any warnings that are issued will appear on your screen after your program has finished execution. Warnings indicate minor problems. A warning will never terminate the execution of your program. You will simply have to hit the Enter key to keep going ~-- which may stop the program on an unattended computer. The forms available are ... 
; ##with warning## : enables all warnings ; ##without warning## : disables all warnings ; ##with warning {//warning name list//}\\ with warning = {//warning name list//}## : enables only these warnings, and disables all other ; ##without warning {//warning name list//}\\ without warning = {//warning name list//}## : enables all warnings except the warnings listed ; ##with warning &= {//warning name list//}\\ with warning += {//warning name list//}## : enables listed warnings in addition to whichever are enabled already ; ##without warning &= {//warning name list//}\\ without warning += {//warning name list//}## : disables listed warnings and leaves any not listed in its current state. ; ##with warning save## : saves the current warning state, i.e. the list of all enabled warnings. This destroys any previously saved state. ; ##with warning restore## : causes the previously saved state to be restored. ; ##without warning strict## : overrides some of the warnings that the -STRICT command line option tests for, but only until the end of the next function or procedure. The warnings overridden are * default_arg_type * not_used * short_circuit * not_reached * empty_case * no_case_else The **with/without warnings** directives will have no effect if the ##-STRICT## command line switch is used. The latter turns on all warnings and ignores any **with/without warnings** statement. It also warns if a parameter of a routine is unused. However, it can be temporarily affected by the "##without warning strict##" directive. ---- **Warning Names** ---- |= Name |= Meaning | ##none## | When used with the ##with## option, this turns off all warnings. When used with the ##without## option, this turns on all warnings. | ##resolution## | an identifier was used in a file, but was defined in a file this file doesn't (recursively) include. 
| ##short_circuit## | A routine call that **could affect the state of your program** may not take place because of [[:short_circuit "short circuit"]] evaluation in a conditional clause.
| ##override## | A built-in is being overridden.
| ##builtin_chosen## | An unqualified call caused Euphoria to choose between a built-in and another global which does not override it. Euphoria chooses the built-in.
| ##not_used## | A variable has not been used and is going out of scope.
| ##no_value## | A variable never got assigned a value and is going out of scope.
| ##custom## | Any warning that was defined using the ##warning## procedure.
| ##not_reached## | After a keyword that branches unconditionally, the only thing that should appear is an end of block keyword, or possibly a label that a goto statement can target. Otherwise, there is no way that the statement can be reached at all. This warning reports that condition.
| ##translator## | An option was given to the translator, but this option is not recognized as valid for the C compiler being used.
| ##cmdline## | A command line option was not recognized.
| ##mixed_profile## | For technical reasons, it is not possible to use both ##with profile## and ##with profile_time## in the same section of code. The profile statement read last is ignored, and this warning is issued.
| ##empty_case## | In a ##switch## that has ##without fallthru##, an empty case block will result in no code being executed within the switch statement.
| ##default_case## | A ##switch## that does not have a ##case else## clause.
| ##default_arg_type## | Reserved (not in use yet)
| ##deprecated## | Reserved (not in use yet)
| ##all## | Turns all warnings on. They can still be disabled by with/without warning directives.
**Example**

with warning save
without warning &= (builtin_chosen, not_used)
. . . -- some code that might otherwise issue warnings
with warning restore

Initially, only the following warnings are enabled:
* ##builtin_chosen##
* ##cmdline##
* ##custom##
* ##mixed_profile##
* ##not_reached##
* ##override##
* ##resolution##
* ##translator##

This set can be changed using the -W or -X command line switches.

@[with_define|]
===== with / without define

As mentioned in [[:ifdef statement]], this top level statement is used to define/undefine tags which the ifdef statement may use.

The following tags have a predefined meaning in Euphoria:
* WINDOWS: platform is any kind of Windows system, i.e. any version of Windows (tm) from '95 on to Vista and beyond
* UNIX: platform is any kind of Unix style system
* LINUX: platform is Linux
* FREEBSD: platform is FreeBSD
* OSX: platform is OS X for Macintosh
* SAFE: turns on a slower debugging version of ##memory.e## called ##safe.e## when defined. Switching mode by renaming files **//no longer works//**.
* EU4: defined on all versions of the version 4 interpreter
* EU4_0: defined on all versions of the interpreter from 4.0.0 to 4.0.X
* EU4_0_0: defined only for version 4.0.0 of the interpreter

The name of a tag may contain any character that is a valid identifier character, that is ##A-Za-z0-9_##. Upper case is not required, but by convention defined words are written in upper case.

@[with_inline|]
==== with / without inline

This directive allows coders some flexibility with inlined routines. The default is for inlining to be on. Any routine that is defined when ##without inline## is in effect will never be inlined. ##with inline## takes an optional integer parameter that defines the largest routine (by size of IL code) that will be considered for inlining. The default is 30.
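
The inlining behavior described above can be sketched as follows. This is only an illustration: the routine names ##checked_div## and ##twice## are hypothetical, the limit of 50 is an arbitrary choice, and the sketch assumes that the optional integer is written directly after ##with inline##.

```euphoria
-- Hypothetical routines, for illustration only.

without inline
-- Defined while 'without inline' is in effect, so it is never inlined.
function checked_div(atom a, atom b)
    if b = 0 then
        return 0
    end if
    return a / b
end function

-- Assumed form: the integer raises the IL-size limit from the default of 30.
with inline 50
-- Small routine defined while inlining is on, so it may be considered for inlining.
function twice(atom x)
    return x * 2
end function

? twice(checked_div(10, 4))
```

Inlining changes only how calls are compiled, not what the routines compute, so the program should print the same result with either setting.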