Files with types

abstract: Objects should be typed to indicate what operations are possible with them; file extensions mostly serve this goal. The proposal here is to make these types stronger and to carry them over to the content of the file when it is read.

Types describe what we can do with an object

An object of a type is associated with the operations described for this type; a type is related to operations which have objects of this type as input or output.

Unfortunately, a single algebra (a Haskell class) rarely describes the operations exhaustively. Nevertheless, it is desirable that a type is founded on a few operations which describe its intended semantics.

Files have a type

Operations on files are typically associated with one or a few programs. For example, files with content written in a programming language, e.g. Haskell, can be read by a compiler for that language, e.g. GHC, and a few utilities. The same holds for other programming languages, and also for editors, where the same editor should be used to read and write a file.

Conversions translate files of one type into files of another type. Editors can export content in various other formats (e.g. PDF); compilers produce object code in files, which can be linked to produce executables.

File types are commonly marked with extensions

It is customary to attach an extension to a filename, mostly a label of 3 characters, which indicates the file type, typically the language in which the content is encoded. For example, '.hs' and '.lhs' indicate Haskell source code, '.tex' indicates LaTeX code, etc.

Files are encoded as binary or character strings; today, character strings are most often encoded as UTF-8.

Reading a file should preserve the type

When a program reads a file, the result should preserve the type of the file. In Haskell this is done with a wrapper around the underlying encoding of the file: a '.tex' file is read and represented internally as some character string (String, Text, ByteString or similar), and this value should be wrapped in a type which indicates that the file contains LaTeX code and can be processed with programs which understand LaTeX.
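The idea can be sketched with a newtype around the raw content. This is only an illustration: the names `LatexText`, `readLatex` and `countSections` are hypothetical and are not part of the implementation described in this note, and `String` stands in for whatever base type is actually used.

```haskell
import Data.List (isPrefixOf)

-- Hypothetical wrapper: the content of a '.tex' file, kept apart from plain strings.
newtype LatexText = LatexText String
    deriving (Show, Eq)

-- | Read a file and wrap its content, so that the type records that it is LaTeX.
readLatex :: FilePath -> IO LatexText
readLatex fp = LatexText <$> readFile fp

-- | Accepts only LaTeX content; passing a plain String is rejected by the compiler.
-- Counts the lines which start (after leading blanks) with "\section".
countSections :: LatexText -> Int
countSections (LatexText s) = length (filter isSection (lines s))
  where
    isSection l = "\\section" `isPrefixOf` dropWhile (== ' ') l
```

A function like `countSections` cannot be applied by accident to the content of a YAML or Haskell file, because that content would carry a different wrapper type.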

Implementation

Define a file type:

data TypedFile5 a b = TypedFile5 { tpext5 :: Extension}

with operations

class FileHandles a => TypedFiles7a a b where
    -- | the 7 have two arguments for path and file
    read7 :: Path Abs Dir -> Path Rel File -> TypedFile5 a b -> ErrIO b
    write7 :: Path Abs Dir -> Path Rel File -> TypedFile5 a b -> b -> ErrIO ()

    -- | the 8 versions have a single argument for path and file
    read8 :: Path Abs File -> TypedFile5 a b -> ErrIO b
    write8 :: Path Abs File -> TypedFile5 a b -> b -> ErrIO ()
    -- ^ the createDir if missing is implied in the write

and a (trivial) class for the wrapping

-- | the a is the base type which is written on file,
-- b is the type for input and output
class FileHandles a => TypedFiles7 a b where
    wrap7 :: a -> b
    unwrap7 :: b -> a
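As a minimal sketch of how the wrapping class behaves, the following simplified version drops the `FileHandles` superclass and uses `String` as the base type; both simplifications are assumptions made to keep the example self-contained. `LatexText` and `roundTripYaml` are hypothetical names added for illustration.

```haskell
{-# LANGUAGE FlexibleInstances     #-}
{-# LANGUAGE MultiParamTypeClasses #-}

-- Simplified stand-in for the wrapping class (no FileHandles superclass here).
class TypedFiles7 a b where
    wrap7   :: a -> b
    unwrap7 :: b -> a

newtype YamlText  = YamlText String  deriving (Show, Eq)
newtype LatexText = LatexText String deriving (Show, Eq) -- hypothetical second file type

-- Two wrappers over the same base type remain distinct types.
instance TypedFiles7 String YamlText where
    wrap7 = YamlText
    unwrap7 (YamlText s) = s

instance TypedFiles7 String LatexText where
    wrap7 = LatexText
    unwrap7 (LatexText s) = s

-- | Wrapping and then unwrapping should give back the base value unchanged.
roundTripYaml :: String -> String
roundTripYaml s = unwrap7 (wrap7 s :: YamlText)
```

Because the class has two parameters, the same base type can serve several file types, and the annotation `:: YamlText` selects which wrapper is meant.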

A new file type needs two type parameters and an extension:

a base type, in which the content is written to disk; this must be a type for which the operations to read and write to disk are instantiated.
Instances can be generic for any b for which the wrappers are instantiated:

instance (TypedFiles7 L.ByteString b) => TypedFiles7a L.ByteString b where

a wrapper type, which is a

newtype YamlText = YamlText Text deriving (Show, Read, Eq, Ord)

for which an instance of the wrapping class is defined:

instance TypedFiles7 Text YamlText where
    -- handling Markdown and read them into YamlText
    wrap7 = YamlText
    unwrap7 (YamlText a) = a

and an extension and a constant

extYAML = Extension "yaml"

instance Zeros YamlText where zero = YamlText zero

yamlFileType :: TypedFile5 Text YamlText
yamlFileType = TypedFile5 {tpext5 = extYAML}

are required.
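The pieces can be put together in a self-contained sketch. It is not the implementation described above but a simplified approximation under stated assumptions: `FilePath` replaces `Path`, `IO` replaces `ErrIO`, `String` replaces `Text`, the `FileHandles` superclass is dropped, and `read8` and `write8` are plain functions instead of class methods. The helpers `withExt` and `demo` are hypothetical.

```haskell
{-# LANGUAGE FlexibleContexts      #-}
{-# LANGUAGE FlexibleInstances     #-}
{-# LANGUAGE MultiParamTypeClasses #-}

import System.Directory (createDirectoryIfMissing, getTemporaryDirectory)
import System.FilePath  (takeDirectory, (<.>), (</>))

newtype Extension = Extension String

-- The type parameters a (base type) and b (wrapper type) are phantom.
data TypedFile5 a b = TypedFile5 { tpext5 :: Extension }

class TypedFiles7 a b where
    wrap7   :: a -> b
    unwrap7 :: b -> a

newtype YamlText = YamlText String deriving (Show, Eq)

instance TypedFiles7 String YamlText where
    wrap7 = YamlText
    unwrap7 (YamlText s) = s

extYAML :: Extension
extYAML = Extension "yaml"

yamlFileType :: TypedFile5 String YamlText
yamlFileType = TypedFile5 { tpext5 = extYAML }

-- | Hypothetical helper: append the extension of the file type to a path.
withExt :: TypedFile5 a b -> FilePath -> FilePath
withExt ft fp = let Extension e = tpext5 ft in fp <.> e

-- | Unwrap the content and write it; the directory is created if missing.
write8 :: TypedFiles7 String b => FilePath -> TypedFile5 String b -> b -> IO ()
write8 fp ft content = do
    let target = withExt ft fp
    createDirectoryIfMissing True (takeDirectory target)
    writeFile target (unwrap7 content)

-- | Read the file and wrap the content in the type fixed by the file type.
read8 :: TypedFiles7 String b => FilePath -> TypedFile5 String b -> IO b
read8 fp ft = wrap7 <$> readFile (withExt ft fp)

-- | Write a YAML file into the temporary directory and read it back.
demo :: IO Bool
demo = do
    tmp <- getTemporaryDirectory
    let base = tmp </> "typed-files-demo" </> "settings" -- given without extension
        content = YamlText "title: Files with types\n"
    write8 base yamlFileType content
    back <- read8 base yamlFileType
    pure (back == content)
```

The file type argument fixes the result type of `read8`: reading with `yamlFileType` can only produce a `YamlText`, and `write8` with `yamlFileType` accepts nothing else.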

Type-safe reading and writing

The read and write operations have a file type argument instead of an extension. The filename should not include an extension; if it has one, it is removed when the read or write operation adds the one associated with the file type.
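The handling of the extension can be sketched as a pure function on paths. This is an assumption about one possible realisation, using `replaceExtension` from `System.FilePath` on plain `FilePath` values instead of the `Path` types used above; `typedPath` is a hypothetical name and `String` stands in for `Text`.

```haskell
import System.FilePath (replaceExtension)

newtype Extension = Extension String

data TypedFile5 a b = TypedFile5 { tpext5 :: Extension }

newtype YamlText = YamlText String

yamlFileType :: TypedFile5 String YamlText
yamlFileType = TypedFile5 { tpext5 = Extension "yaml" }

-- | Drop any extension the caller supplied and use the one of the file type.
typedPath :: TypedFile5 a b -> FilePath -> FilePath
typedPath ft fp = replaceExtension fp ext
  where
    Extension ext = tpext5 ft
```

With this sketch, `typedPath yamlFileType "conf/settings"` and `typedPath yamlFileType "conf/settings.txt"` both yield `conf/settings.yaml`. Note that `replaceExtension` replaces only the last extension, so a name with several dots keeps the earlier parts.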
