CREATE TYPE

Name

CREATE TYPE -- 定义一种新的数据类型

Synopsis

CREATE TYPE name AS
    ( [ attribute_name data_type [ COLLATE collation ] [, ... ] ] )

CREATE TYPE name AS ENUM
    ( [ 'label' [, ... ] ] )

CREATE TYPE name AS RANGE (
    SUBTYPE = subtype
    [ , SUBTYPE_OPCLASS = subtype_operator_class ]
    [ , COLLATION = collation ]
    [ , CANONICAL = canonical_function ]
    [ , SUBTYPE_DIFF = subtype_diff_function ]
)

CREATE TYPE name (
    INPUT = input_function,
    OUTPUT = output_function
    [ , RECEIVE = receive_function ]
    [ , SEND = send_function ]
    [ , TYPMOD_IN = type_modifier_input_function ]
    [ , TYPMOD_OUT = type_modifier_output_function ]
    [ , ANALYZE = analyze_function ]
    [ , INTERNALLENGTH = { internallength | VARIABLE } ]
    [ , PASSEDBYVALUE ]
    [ , ALIGNMENT = alignment ]
    [ , STORAGE = storage ]
    [ , LIKE = like_type ]
    [ , CATEGORY = category ]
    [ , PREFERRED = preferred ]
    [ , DEFAULT = default ]
    [ , ELEMENT = element ]
    [ , DELIMITER = delimiter ]
    [ , COLLATABLE = collatable ]
)

CREATE TYPE name

描述

CREATE TYPE在当前数据库中注册一种新的 数据类型。定义数据类型的用户将成为它的拥有者。

如果给定一个模式名,那么该类型将被创建在指定的模式中。否则它会被 创建在当前模式中。类型名称必须与同一个模式中任何现有的类型或者域 相区别(因为表具有相关的数据类型,类型名称也必须与同一个模式中任 何现有表的名字不同)。

如上面的语法所示,有五种形式的CREATE TYPE。 它们分别创建组合类型枚举类型范围类型基础类型或者 shell 类型。下文将依次讨论前四种形式。shell 类型仅仅 是一种用于后面要定义的类型的占位符,通过发出一个不带除类型名之外其 他参数的CREATE TYPE命令可以创建这种类型。 在创建范围类型和基础类型时,需要 shell 类型作为一种向前引用。

组合类型

第一种形式的CREATE TYPE创建组合类型。 组合类型由一个属性名和数据类型的列表指定。如果属性的数据类型是可 排序的,也可以指定该属性的排序规则。组合类型本质上和表的行类型相 同,但是如果只想定义一种类型,使用 CREATE TYPE避免了创建一个实际的表。单 独的组合类型也是很有用的,例如可以作为函数的参数或者返回类型。

为了能够创建组合类型,必须拥有在其所有属性类型上的 USAGE特权。

枚举类型

Section 8.7中所述,第二种形式的 CREATE TYPE创建枚举类型。枚举类型需要 由一个或者更多带引号的标签构成的列表,每一个标签长度必须不超过 NAMEDATALEN字节(在标准的 PostgreSQL编译中是 64 字节)。

范围类型

Section 8.17中所述,第三种形式的 CREATE TYPE创建范围类型。

范围类型的subtype可以 是任何带有一个相关的 B 树操作符类(用来决定该范围类型值的顺序)的类型。 通常,子类型的默认 B 树操作符类被用来决定顺序。要使用一种非默认操作符 类,可以用 subtype_opclass指定它 的名字。如果子类型是可排序的并且希望在该范围的顺序中使用一种非默认的 排序规则,可以用collation选项来指定。

可选的canonical函数 必须接受一个所定义的范围类型的参数,并且返回同样类型的一个值。在适 用时,它被用来把范围值转换成一种规范的形式。更多信息请见Section 8.17.8。创建一个 canonical函数有点棘 手,因为必须在声明范围类型之前定义它。要这样做,必须首先创建一种 shell 类型,它是一种没有属性只有名称和拥有者的占位符类型。这可以通过 发出不带额外参数的命令CREATE TYPE name来完成。然后可以使用该 shell 类型作为 参数和结果来声明该函数,并且最终用同样的名称来声明范围类型。这会自动 用一种合法的范围类型替换 shell 类型项。

可选的subtype_diff函数 必须接受两个subtype类型 的值作为参数,并且返回一个double precision值表示两个给定 值之间的差别。虽然这是可选的,但是提供这个函数会让该范围类型列上 GiST 索 引效率更高。详见Section 8.17.8

基础类型

第四种形式的CREATE TYPE创建一种新的 基本类型(标量类型)。为了创建一种新的基本类型,你必须是一个超级 用户(做这种限制的原因是一种错误的类型定义可能让服务器混淆甚至 崩溃)。

参数可以以任意顺序出现(而不仅是按照上面所示的顺序),并且大部分 是可选的。在定义类型前,必须注册两个或者更多函数(使用 CREATE FUNCTION)。支持函数 input_function以及 output_function 是必需的,而函数 receive_functionsend_functiontype_modifier_input_functiontype_modifier_output_functionanalyze_function 是可选的。通常来说这些函数必须是用 C 或者另外一种低层语言编写的。

The input_function converts the type's external textual representation to the internal representation used by the operators and functions defined for the type. output_function performs the reverse transformation. The input function can be declared as taking one argument of type cstring, or as taking three arguments of types cstring, oid, integer. The first argument is the input text as a C string, the second argument is the type's own OID (except for array types, which instead receive their element type's OID), and the third is the typmod of the destination column, if known (-1 will be passed if not). The input function must return a value of the data type itself. Usually, an input function should be declared STRICT; if it is not, it will be called with a NULL first parameter when reading a NULL input value. The function must still return NULL in this case, unless it raises an error. (This case is mainly meant to support domain input functions, which might need to reject NULL inputs.) The output function must be declared as taking one argument of the new data type. The output function must return type cstring. Output functions are not invoked for NULL values.

The optional receive_function converts the type's external binary representation to the internal representation. If this function is not supplied, the type cannot participate in binary input. The binary representation should be chosen to be cheap to convert to internal form, while being reasonably portable. (For example, the standard integer data types use network byte order as the external binary representation, while the internal representation is in the machine's native byte order.) The receive function should perform adequate checking to ensure that the value is valid. The receive function can be declared as taking one argument of type internal, or as taking three arguments of types internal, oid, integer. The first argument is a pointer to a StringInfo buffer holding the received byte string; the optional arguments are the same as for the text input function. The receive function must return a value of the data type itself. Usually, a receive function should be declared STRICT; if it is not, it will be called with a NULL first parameter when reading a NULL input value. The function must still return NULL in this case, unless it raises an error. (This case is mainly meant to support domain receive functions, which might need to reject NULL inputs.) Similarly, the optional send_function converts from the internal representation to the external binary representation. If this function is not supplied, the type cannot participate in binary output. The send function must be declared as taking one argument of the new data type. The send function must return type bytea. Send functions are not invoked for NULL values.

You should at this point be wondering how the input and output functions can be declared to have results or arguments of the new type, when they have to be created before the new type can be created. The answer is that the type should first be defined as a shell type, which is a placeholder type that has no properties except a name and an owner. This is done by issuing the command CREATE TYPE name, with no additional parameters. Then the I/O functions can be defined referencing the shell type. Finally, CREATE TYPE with a full definition replaces the shell entry with a complete, valid type definition, after which the new type can be used normally.

The optional type_modifier_input_function and type_modifier_output_function are needed if the type supports modifiers, that is optional constraints attached to a type declaration, such as char(5) or numeric(30,2). PostgreSQL allows user-defined types to take one or more simple constants or identifiers as modifiers. However, this information must be capable of being packed into a single non-negative integer value for storage in the system catalogs. The type_modifier_input_function is passed the declared modifier(s) in the form of a cstring array. It must check the values for validity (throwing an error if they are wrong), and if they are correct, return a single non-negative integer value that will be stored as the column "typmod". Type modifiers will be rejected if the type does not have a type_modifier_input_function. The type_modifier_output_function converts the internal integer typmod value back to the correct form for user display. It must return a cstring value that is the exact string to append to the type name; for example numeric's function might return (30,2). It is allowed to omit the type_modifier_output_function, in which case the default display format is just the stored typmod integer value enclosed in parentheses.

The optional analyze_function performs type-specific statistics collection for columns of the data type. By default, ANALYZE will attempt to gather statistics using the type's "equals" and "less-than" operators, if there is a default b-tree operator class for the type. For non-scalar types this behavior is likely to be unsuitable, so it can be overridden by specifying a custom analysis function. The analysis function must be declared to take a single argument of type internal, and return a boolean result. The detailed API for analysis functions appears in src/include/commands/vacuum.h.

While the details of the new type's internal representation are only known to the I/O functions and other functions you create to work with the type, there are several properties of the internal representation that must be declared to PostgreSQL. Foremost of these is internallength. Base data types can be fixed-length, in which case internallength is a positive integer, or variable length, indicated by setting internallength to VARIABLE. (Internally, this is represented by setting typlen to -1.) The internal representation of all variable-length types must start with a 4-byte integer giving the total length of this value of the type.

The optional flag PASSEDBYVALUE indicates that values of this data type are passed by value, rather than by reference. You cannot pass by value types whose internal representation is larger than the size of the Datum type (4 bytes on most machines, 8 bytes on a few).

The alignment parameter specifies the storage alignment required for the data type. The allowed values equate to alignment on 1, 2, 4, or 8 byte boundaries. Note that variable-length types must have an alignment of at least 4, since they necessarily contain an int4 as their first component.

The storage parameter allows selection of storage strategies for variable-length data types. (Only plain is allowed for fixed-length types.) plain specifies that data of the type will always be stored in-line and not compressed. extended specifies that the system will first try to compress a long data value, and will move the value out of the main table row if it's still too long. external allows the value to be moved out of the main table, but the system will not try to compress it. main allows compression, but discourages moving the value out of the main table. (Data items with this storage strategy might still be moved out of the main table if there is no other way to make a row fit, but they will be kept in the main table preferentially over extended and external items.)

The like_type parameter provides an alternative method for specifying the basic representation properties of a data type: copy them from some existing type. The values of internallength, passedbyvalue, alignment, and storage are copied from the named type. (It is possible, though usually undesirable, to override some of these values by specifying them along with the LIKE clause.) Specifying representation this way is especially useful when the low-level implementation of the new type "piggybacks" on an existing type in some fashion.

The category and preferred parameters can be used to help control which implicit cast will be applied in ambiguous situations. Each data type belongs to a category named by a single ASCII character, and each type is either "preferred" or not within its category. The parser will prefer casting to preferred types (but only from other types within the same category) when this rule is helpful in resolving overloaded functions or operators. For more details see Chapter 10. For types that have no implicit casts to or from any other types, it is sufficient to leave these settings at the defaults. However, for a group of related types that have implicit casts, it is often helpful to mark them all as belonging to a category and select one or two of the "most general" types as being preferred within the category. The category parameter is especially useful when adding a user-defined type to an existing built-in category, such as the numeric or string types. However, it is also possible to create new entirely-user-defined type categories. Select any ASCII character other than an upper-case letter to name such a category.

A default value can be specified, in case a user wants columns of the data type to default to something other than the null value. Specify the default with the DEFAULT key word. (Such a default can be overridden by an explicit DEFAULT clause attached to a particular column.)

To indicate that a type is an array, specify the type of the array elements using the ELEMENT key word. For example, to define an array of 4-byte integers (int4), specify ELEMENT = int4. More details about array types appear below.

To indicate the delimiter to be used between values in the external representation of arrays of this type, delimiter can be set to a specific character. The default delimiter is the comma (,). Note that the delimiter is associated with the array element type, not the array type itself.

If the optional Boolean parameter collatable is true, column definitions and expressions of the type may carry collation information through use of the COLLATE clause. It is up to the implementations of the functions operating on the type to actually make use of the collation information; this does not happen automatically merely by marking the type collatable.

数组类型

Whenever a user-defined type is created, PostgreSQL automatically creates an associated array type, whose name consists of the element type's name prepended with an underscore, and truncated if necessary to keep it less than NAMEDATALEN bytes long. (If the name so generated collides with an existing type name, the process is repeated until a non-colliding name is found.) This implicitly-created array type is variable length and uses the built-in input and output functions array_in and array_out. The array type tracks any changes in its element type's owner or schema, and is dropped if the element type is.

You might reasonably ask why there is an ELEMENT option, if the system makes the correct array type automatically. The only case where it's useful to use ELEMENT is when you are making a fixed-length type that happens to be internally an array of a number of identical things, and you want to allow these things to be accessed directly by subscripting, in addition to whatever operations you plan to provide for the type as a whole. For example, type point is represented as just two floating-point numbers, each can be accessed using point[0] and point[1]. Note that this facility only works for fixed-length types whose internal form is exactly a sequence of identical fixed-length fields. A subscriptable variable-length type must have the generalized internal representation used by array_in and array_out. For historical reasons (i.e., this is clearly wrong but it's far too late to change it), subscripting of fixed-length array types starts from zero, rather than from one as for variable-length arrays.

参数

name

要创建的类型的名称(可以是模式限定的)。

attribute_name

组合类型的一个属性(列)的名称。

data_type

要成为组合类型的一列的现有数据类型的名称。

collation

要与组合类型的一列或者范围类型相关联的现有排序规则的名称。

label

表示与枚举类型的一个值相关联的文本标签的字符串。

subtype

范围类型将表示的范围的元素类型的名称。

subtype_operator_class

subtype 的一个 B-树操作符类的名称。

canonical_function

用于范围类型的规范化函数的名称。

subtype_diff_function

用于 subtype 的差函数的名称。

input_function

将数据从类型的外部文本形式转换到其内部形式的函数名称。

output_function

将数据从类型的内部形式转换到其外部文本形式的函数名称。

receive_function

将数据从类型的外部二进制形式转换到其内部形式的函数名称。

send_function

将数据从类型的内部形式转换到其外部二进制形式的函数名称。

type_modifier_input_function

将该类型的修饰语数组转换为内部形式的函数名。

type_modifier_output_function

将该类型的修饰语的内部形式转换为外部文本形式的函数名。

analyze_function

为该数据类型执行统计分析的函数名。

internallength

一个数值常量,它指定新类型的内部表示的长度(字节数)。 默认假定它是变长的。

alignment

数据类型的存储对齐要求。如果指定,必须是 charint2int4或者double。默认是 int4

storage

数据类型的存储策略。如果指定,必须是 plainexternalextended或者main。默认是 plain

like_type

一种现有数据类型的名称,新类型将具有和它相同的表示。新类型 会从该类型中复制 internallengthpassedbyvaluealignment以及 storage的值, 不过在这个CREATE TYPE命令中对上述值的显式 指定会覆盖这些复制的值。

category

这种类型的分类码(一个 ASCII 字符)。默认是'U',表示 "用户定义类型"。其他标准分类码可以在 Table 48-53中找到。为了创建自定义 的分类,需要选择其他的 ASCII字符。

preferred

如果这种类型是其类型分类中的首选类型则为真,否则为假。 默认值为假。在一个现有的类型分类中创建一个新的首选类型 时要小心,因为这可能导致行为出现改变。

default

该数据类型的默认值。如果被忽略,默认值为空值。

element

被创建的类型是一个数组,这里指定的是数组元素的类型。

delimiter

用这种类型构造的数组中用来分隔值的定界符。

collatable

如果这种类型的操作可以使用排序规则信息,则为真。默认为假。

注解

Because there are no restrictions on use of a data type once it's been created, creating a base type or range type is tantamount to granting public execute permission on the functions mentioned in the type definition. This is usually not an issue for the sorts of functions that are useful in a type definition. But you might want to think twice before designing a type in a way that would require "secret" information to be used while converting it to or from external form.

Before PostgreSQL version 8.3, the name of a generated array type was always exactly the element type's name with one underscore character (_) prepended. (Type names were therefore restricted in length to one less character than other names.) While this is still usually the case, the array type name may vary from this in case of maximum-length names or collisions with user type names that begin with underscore. Writing code that depends on this convention is therefore deprecated. Instead, use pg_type.typarray to locate the array type associated with a given type.

It may be advisable to avoid using type and table names that begin with underscore. While the server will change generated array type names to avoid collisions with user-given names, there is still risk of confusion, particularly with old client software that may assume that type names beginning with underscores always represent arrays.

Before PostgreSQL version 8.2, the shell-type creation syntax CREATE TYPE name did not exist. The way to create a new base type was to create its input function first. In this approach, PostgreSQL will first see the name of the new data type as the return type of the input function. The shell type is implicitly created in this situation, and then it can be referenced in the definitions of the remaining I/O functions. This approach still works, but is deprecated and might be disallowed in some future release. Also, to avoid accidentally cluttering the catalogs with shell types as a result of simple typos in function definitions, a shell type will only be made this way when the input function is written in C.

In PostgreSQL versions before 7.3, it was customary to avoid creating a shell type at all, by replacing the functions' forward references to the type name with the placeholder pseudotype opaque. The cstring arguments and results also had to be declared as opaque before 7.3. To support loading of old dump files, CREATE TYPE will accept I/O functions declared using opaque, but it will issue a notice and change the function declarations to use the correct types.

示例

This example creates a composite type and uses it in a function definition:

CREATE TYPE compfoo AS (f1 int, f2 text);

CREATE FUNCTION getfoo() RETURNS SETOF compfoo AS $$
    SELECT fooid, fooname FROM foo
$$ LANGUAGE SQL;

This example creates an enumerated type and uses it in a table definition:

CREATE TYPE bug_status AS ENUM ('new', 'open', 'closed');

CREATE TABLE bug (
    id serial,
    description text,
    status bug_status
);

This example creates a range type:

CREATE TYPE float8_range AS RANGE (subtype = float8, subtype_diff = float8mi);

This example creates the base data type box and then uses the type in a table definition:

CREATE TYPE box;

CREATE FUNCTION my_box_in_function(cstring) RETURNS box AS ... ;
CREATE FUNCTION my_box_out_function(box) RETURNS cstring AS ... ;

CREATE TYPE box (
    INTERNALLENGTH = 16,
    INPUT = my_box_in_function,
    OUTPUT = my_box_out_function
);

CREATE TABLE myboxes (
    id integer,
    description box
);

If the internal structure of box were an array of four float4 elements, we might instead use:

CREATE TYPE box (
    INTERNALLENGTH = 16,
    INPUT = my_box_in_function,
    OUTPUT = my_box_out_function,
    ELEMENT = float4
);

which would allow a box value's component numbers to be accessed by subscripting. Otherwise the type behaves the same as before.

This example creates a large object type and uses it in a table definition:

CREATE TYPE bigobj (
    INPUT = lo_filein, OUTPUT = lo_fileout,
    INTERNALLENGTH = VARIABLE
);
CREATE TABLE big_objs (
    id integer,
    obj bigobj
);

Section 35.11中有更多例子,包括合适的输入和输出函数。

兼容性

The first form of the CREATE TYPE command, which creates a composite type, conforms to the SQL standard. The other forms are PostgreSQL extensions. The CREATE TYPE statement in the SQL standard also defines other forms that are not implemented in PostgreSQL.

The ability to create a composite type with zero attributes is a PostgreSQL-specific deviation from the standard (analogous to the same case in CREATE TABLE).

另见

ALTER TYPE, CREATE DOMAIN, CREATE FUNCTION, DROP TYPE