GUI
GUI Layers and features in traditional Windows, Icons, Menus, Pointer (WIMP) interfaces
- Font, characters, text and spacing:
- Subpixel hinting
- Colored fonts (used e.g. for colored emojis)
- Fonts with ligatures
- kerning
- line-spacing (a lot of fonts have a variable height)
- A font can have parts of the character outside of the box used to
position the characters, e.g. the arm of a capital L may continue
under the following letters in Loki Cola
- Font formats:
- bitmap vs. vector.
- Format names: otf, metafont, ttf, …
- Available features differ in formats, e.g. color, kinds of
curves (degree of bézier curves)
- Variable-width fonts
- Flexible spacing (glue in TeX)
- Hyphenation
- Knuth-Plass algorithm for line breaking during text reflow (used in TeX);
TeX's hyphenation uses Liang's pattern-based algorithm
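A minimal sketch of optimal line breaking, in the spirit of what TeX does but much simplified: it assumes a monospace font, no stretchable glue and no hyphenation, and it minimizes the sum of squared trailing spaces on every line except the last. The function name `break_lines` and the cost function are illustrative assumptions, not TeX's actual badness formula.

```python
def break_lines(words, width):
    """Split words into lines of at most `width` characters, minimizing the
    sum of squared trailing spaces over all lines except the last one."""
    n = len(words)
    best = [float("inf")] * (n + 1)  # best[i] = minimal cost of laying out words[i:]
    nxt = [n] * (n + 1)              # nxt[i] = index of the first word of the next line
    best[n] = 0
    for i in range(n - 1, -1, -1):
        length = -1  # cancels the space counted before the first word of the line
        for j in range(i + 1, n + 1):
            length += 1 + len(words[j - 1])
            # A single overlong word is still accepted alone on its line.
            if length > width and j > i + 1:
                break
            cost = 0 if j == n else (width - length) ** 2
            if cost + best[j] < best[i]:
                best[i] = cost + best[j]
                nxt[i] = j
    lines, i = [], 0
    while i < n:
        lines.append(" ".join(words[i:nxt[i]]))
        i = nxt[i]
    return lines


print(break_lines("aaa bb cc ddddd".split(), 6))  # ['aaa', 'bb cc', 'ddddd']
```

A greedy first-fit breaker would produce "aaa bb" / "cc" / "ddddd" on the same input, leaving a very ragged second line; looking at the whole paragraph at once avoids this.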
- Font size, dpi and zoom level (font size is not just a zoom
because the font weight should vary slightly based on the
on-screen size)
- On-screen size of a text is hard to determine in advance, i.e. a
call to size(text, zoomlevel * fontsize) is needed.
- In order to avoid reflow when zooming, the font and text layout
must avoid changing the width of a string when the zoom factor
changes.
- HiDPI displays
- Decomposition of characters:
- Character = an individual entry in the Unicode list of characters,
e.g. "e"
- Code point = the number / identifier assigned to a character by the
Unicode standard, e.g. U+0065 for "e"
- Grapheme = sequence of code points that are combined together to
visually form a single entity, e.g. U+0065 (latin small letter
E) combines with U+0301 (combining acute accent) to give the
grapheme "é" (this grapheme also has a legacy encoding as a
single code point, but not all combinations have an equivalent
code point). source: https://stackoverflow.com/a/27331885
- Unicode combining diacritics and joiners are used for this
(used for accented characters and emojis)
- Glyph = an image, stored in a font, for a given character
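The code point / grapheme distinction above can be checked with Python's standard `unicodedata` module. This is only an illustration of the "e" + combining acute accent example; the helper `code_points` is a name made up for this sketch.

```python
import unicodedata

decomposed = "e\u0301"   # U+0065 followed by U+0301 (combining acute accent)
precomposed = "\u00e9"   # legacy single code point for the same grapheme


def code_points(text):
    """Return the code points of a string in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]


# NFC composes where a precomposed code point exists; NFD decomposes.
nfc = unicodedata.normalize("NFC", decomposed)
nfd = unicodedata.normalize("NFD", precomposed)

print(code_points(decomposed))     # ['U+0065', 'U+0301']
print(code_points(nfc))            # ['U+00E9']
print(len(decomposed), len(nfc))   # 2 1
```

Note that `len` counts code points, not graphemes: both strings display as a single "é", so cursor movement and selection should not rely on code point counts.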
- Subscript, superscript, sub-subscripts, symbols like \sum with
scripts below and above the symbol
- Left-To-Right and Right-To-Left, boustrophedon (alternating LTR
and RTL), Vertical and horizontal directions.
- These also affect the way selection and cursor navigation works
(the current status in some widely-used GUIs is very bad when
RTL and LTR are mixed).
- Ambiguity of displayed Unicode characters
- "Inline string" data type should contain a mix of inline elements.
- The gist is that:
- an "inline string" is a sequence of character-like entities
- it is displayed with line breaks only when the end of the
line is reached
- it should be possible to read an "inline string" aloud, with the
few graphical elements described in words when reading
- if it would make sense to have a ligature between two
characters, they should be in the same inline string when sent
to display (the opposite is not true, two characters can be in
the same inline string but displayed without a ligature
e.g. because they are written in different languages)
- characters in unicode or other encodings
- emojis and mathematical symbols that do not have a codepoint yet
- spark graphs (small inline line graph of the variation of some
quantity)
- small inline directed or undirected graphs
- other small pictures
- anchors (to mark special points in the text e.g. to mark the
destination of an arrow)
- subelements (to mark the start and end of hyperlinks, bold,
highlight, annotations, tooltips, languages and so on).
- An "inline string" is laid out as a flow (see Layout below)
- An "inline string" can be processed as a stream of
character-like entities + delimiters by doing an inorder
traversal of the tree of subelements.
- JetBrains MPS and GNU Emacs use something similar to describe
GUIs to display ASTs
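A minimal sketch of the "stream of character-like entities + delimiters" idea above. The `Sub` class and the `("start", …)` / `("char", …)` / `("end", …)` event tuples are assumptions made for this example; the traversal is a depth-first walk in document order, and it iterates over code points rather than graphemes for brevity.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Union


@dataclass
class Sub:
    """A subelement (hyperlink, bold, language tag, ...) wrapping children."""
    kind: str
    children: List[Union[str, "Sub"]] = field(default_factory=list)


def stream(node: Union[str, Sub]) -> Iterator[tuple]:
    """Yield character events and start/end delimiters in document order."""
    if isinstance(node, str):
        for ch in node:
            yield ("char", ch)
    else:
        yield ("start", node.kind)
        for child in node.children:
            yield from stream(child)
        yield ("end", node.kind)


doc = Sub("root", ["a ", Sub("bold", ["b"]), "c"])
for event in stream(doc):
    print(event)
```

A screen reader or a layout engine could consume the same stream, each reacting to the delimiters it cares about.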
- There should be a way to display the text cursor (e.g. vertical
bar) without moving the rest of the text
- Icons
- vectorial
- low-resolution vectorial (Haiku OS uses a special vectorial
image format for low-resolution icons).
- bitmap
- support for alternative icon themes
- composition of icons (emblems on top of file icons to indicate
important, read-only, shortcut etc.)
- Selection
- The Humane Interface suggests what seems like the best approach: a cursor
is both an insertion point marked with a vertical bar with a small
curve towards the next character (l-shaped bar) and a selection
marked with a grey background.
- When inserting a character, e.g. "a", it is inserted on the side
of the vertical bar indicated by its small curve.
- When deleting or copying, the selection (grey part) is used.
- This way it is clear what both kinds of operations do.
- In The Humane Interface there is a chapter on selection, with plenty of
good ideas, e.g. clicking in the middle of a character selects
that character and puts the insertion cursor to the right of that
character. That way, the user does not need to aim at the
small space between two characters. Most GUIs put the cursor on
the left (or right) of the character for clicks in its left (right
respectively) half, but it is difficult to learn this fact
(because the area of half a character is so small, it is not
evident that the effect was linked to the position of the input
click within the character).
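The two click policies discussed above can be compared with a small sketch. It assumes a single line of text whose character boundaries are known as a sorted list `edges` (left edge of each character, plus the right edge of the last one); all function names are hypothetical.

```python
from bisect import bisect_right


def char_at(x, edges):
    """Index of the character whose horizontal extent contains x (clamped)."""
    i = bisect_right(edges, x) - 1
    return max(0, min(i, len(edges) - 2))


def click_half_character(x, edges):
    """Common behaviour: the cursor goes to the nearest character boundary."""
    i = char_at(x, edges)
    middle = (edges[i] + edges[i + 1]) / 2
    return {"cursor": i if x < middle else i + 1, "selection": None}


def click_whole_character(x, edges):
    """Behaviour described above: select the clicked character and put the
    insertion cursor to its right."""
    i = char_at(x, edges)
    return {"cursor": i + 1, "selection": (i, i + 1)}


edges = [0, 10, 20, 30]  # three characters, each 10 pixels wide
print(click_half_character(12, edges))   # {'cursor': 1, 'selection': None}
print(click_half_character(18, edges))   # {'cursor': 2, 'selection': None}
print(click_whole_character(12, edges))  # {'cursor': 2, 'selection': (1, 2)}
```

With the first policy two clicks 6 pixels apart on the same character give different results; with the second, any click on the character gives the same result.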
- Multiple selections, multiple insertion cursors
- Click: query a quadtree, kd-tree or object ID bitmap to know in
which text field and on which character the click landed
- Click+drag: query widget and character for click and each
subsequent mouse position, and highlight the elements
in between. Needs an order on the elements. This order is the
natural one for sequences of inline widgets. When the mouse cursor
is not on a selectable object (outside of the region containing
the sequence of inline elements, or over a non-selectable part of
that region) it is necessary to find the closest element that is
contiguously selectable with respect to the initial click. In
general this cannot be done with an object bitmap (as the answer
depends on where the first click occurred), and a more clever data
structure is needed.
- On X.org, selecting text copies it to the selection
clipboard. An explicit keyboard shortcut or context menu action
copies to the separate Ctrl+C Ctrl+V clipboard. Some applications
or GUI toolkits accidentally overwrite the selection clipboard
when focus changes; care should be taken to only set the selection
clipboard after an explicit selection with the mouse (drag) or
keyboard (shift+motion keys).
- Human input devices
- Pointer input
- Mouse is relative
- Mouse motion is usually integrated to obtain an absolute
position on-screen.
- In games mouse motion is used as a relative measure to
indicate rotation of the camera.
- Touchscreen, tablet and eye tracking are absolute
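A sketch of the two uses of relative mouse motion mentioned above: integrating deltas into an absolute on-screen position, and using a delta directly as a camera rotation. The function names, the clamping to the screen edges and the `sensitivity` value are assumptions made for this example.

```python
def integrate(position, deltas, width, height):
    """Accumulate relative (dx, dy) motions into an absolute position,
    clamped to a screen of width x height pixels."""
    x, y = position
    for dx, dy in deltas:
        x = min(max(x + dx, 0), width - 1)
        y = min(max(y + dy, 0), height - 1)
    return x, y


def rotate_camera(yaw_degrees, dx, sensitivity=0.1):
    """Game-style use: the horizontal delta rotates the camera, with no
    notion of an absolute pointer position."""
    return (yaw_degrees + dx * sensitivity) % 360


print(integrate((10, 10), [(5, -3), (-100, 0)], 800, 600))  # (0, 7)
```

Clamping at each step (rather than once at the end) matches what users expect: pushing the pointer against the left edge and then moving right moves it immediately.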
- Keyboard input
- input method (XIM, UIM, SCIM, IBus …)
- Android keyboard and on-screen keyboards
- n-key rollover
- capacity to detect many keys pressed at the same time.
- many keyboards handle modifier keys specially; this means that
modifiers cannot be mapped elsewhere on the keyboard, otherwise
some key combinations like Shift+Ctrl+C might not be detected
correctly.
- The Fn modifier key on laptops is often handled directly by
the firmware, and does not send an actual keycode. This
prevents using that key as a non-modifier key or as a modifier
in combinations which are not explicitly supported by the
firmware. Some laptops (e.g. some ThinkPad laptops) send an
actual scancode for that key.
- multimedia keys (volume down, calculator, suspend computer, …)
- scancodes (sent directly by keyboard for down/up events)
- keyboard layout: mapping from scancodes to keysyms (default
layout, switch layout with a key combination or GUI widget,
customization of the layout with e.g. Xmodmap)
- keyboard macros (often implemented in hardware because of the
lack of good software support)
- XCompose is a sort of keyboard macro mechanism: the user
inputs a sequence of keys, which are translated to one (or
several) Unicode symbol(s)
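The compose mechanism can be sketched as a small state machine over key sequences. This is not the XCompose file format or the X11 implementation: the table below, the `Composer` class and the choice to silently drop an unknown sequence are all assumptions for illustration. `"Multi_key"` stands for the compose key.

```python
COMPOSE_TABLE = {  # hypothetical table, not taken from a real XCompose file
    ("Multi_key", "'", "e"): "é",
    ("Multi_key", "o", "e"): "œ",
    ("Multi_key", "<", "="): "≤",
}


class Composer:
    def __init__(self, table):
        self.table = table
        # Every proper prefix of a sequence means "keep waiting for more keys".
        self.prefixes = {seq[:i] for seq in table for i in range(1, len(seq))}
        self.pending = ()

    def feed(self, key):
        """Return the text produced by this key press ('' while pending)."""
        seq = self.pending + (key,)
        if seq in self.table:
            self.pending = ()
            return self.table[seq]
        if seq in self.prefixes:
            self.pending = seq
            return ""
        # Unknown sequence: drop it; a lone ordinary key is passed through.
        self.pending = ()
        return key if len(seq) == 1 else ""


composer = Composer(COMPOSE_TABLE)
typed = ["a", "Multi_key", "'", "e", "Multi_key", "z", "b"]
print("".join(composer.feed(k) for k in typed))  # aéb
```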
- application has access to scancodes, mapped keysyms and the
output of XCompose macros. Some GUI toolkits do not provide
access to all information.
- modifier keys can act as regular keys (e.g. super is both a
modifier and a key that opens the application menu)
- multiple input devices connected to the same machine
- keyboard
- separate numeric pad
- pedals
- power management buttons (on/off/sleep button)
- USB devices can act as keyboards (e.g. iPod with
RockBox simulates volume and previous/next keys when the
buttons on the device are pressed).
- multiseat: multiple input and output devices used by several users
at the same time. Input devices are dispatched to the session of
the corresponding user.
- Focus:
- next / previous at several levels (left/right vs. alt+tab are extremes)
- in/out for zoomable and hierarchical interfaces.
- notify system when the focus changes (for accessibility etc.)
- receive from the system the order to place the focus somewhere
(including inside a menu), for "show me" in help files, tutorials
etc.
- Semantic structure of the interface
- The semantic structure may not match the layout structure (e.g. a
sequence of buttons may be displayed as several separate
horizontal containers).
- Tab order (order in which the elements are cycled through via the
Tab key)
- It should be possible to read all elements using a screen reader
- It should be possible to place the focus on all actionable
elements, including a way to copy read-only text that does not
normally receive focus.
- To this effect the interface can be linearized, or better
structured in a hierarchical form (to avoid the need to focus many
elements before the desired one).
- Report position of widgets
- to help black-box GUI testing
- so that help files and tutorials can show where an element is
- so that accessibility options can highlight the element under focus
- needed internally to display tooltips
- Layout:
- Flow
- inline (like <span> in HTML)
- other (relative positioning of blocks)
- connected text zones (text can start filling one zone and then
continue to spill in the next one)
- table-like alignment (comments aligned on the right, similar
function calls formatted like a table, labels and input fields, …)
- other alignment of widgets; macOS uses the Cassowary Constraint Solving Toolkit to
solve alignment constraints.
- Display and update overlapping widgets
- Network transparency
- In X.org, applications send streams of commands (including
OpenGL commands, see notes on compositing) to the display server. This
model allows the program and the server to be on different
machines.
- Other systems require detection of updated parts of the screen
(dirty rectangles) and sending the pixels in those portions of the
image to the display machine.
- HTML pushes the toolkit on the client-side, while the server-side
may perform the computation.
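A minimal sketch of the dirty-rectangle approach described above, assuming frames are lists of rows of pixel values and that a single bounding box is sent per update (real systems usually track several rectangles). All function names are made up for this example.

```python
def dirty_rectangle(old, new):
    """Bounding box (x0, y0, x1, y1) of the changed pixels, with exclusive
    upper bounds, or None if the two frames are identical."""
    changed = [
        (x, y)
        for y, (row_old, row_new) in enumerate(zip(old, new))
        for x, (a, b) in enumerate(zip(row_old, row_new))
        if a != b
    ]
    if not changed:
        return None
    xs = [x for x, _ in changed]
    ys = [y for _, y in changed]
    return (min(xs), min(ys), max(xs) + 1, max(ys) + 1)


def extract(frame, rect):
    """Pixels inside rect: this is what gets sent over the network."""
    x0, y0, x1, y1 = rect
    return [row[x0:x1] for row in frame[y0:y1]]


def apply_update(frame, rect, pixels):
    """On the display machine: paste the received pixels into the frame."""
    x0, y0, x1, _ = rect
    for offset, row in enumerate(pixels):
        frame[y0 + offset][x0:x1] = row


old = [[0] * 4 for _ in range(3)]
new = [row[:] for row in old]
new[1][2], new[2][3] = 7, 9
rect = dirty_rectangle(old, new)
print(rect)  # (2, 1, 4, 3)
```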
- UI construction and update paradigm
- Reactive
- Functional Reactive Programming
- Object Oriented
- …
- Accessibility
- Features
- color inversion (can be done by compositing)
- constrained color palettes, e.g. high contrast or inverse high
contrast (done by toolkit (theme) or compositing)
- edge detection on images
- alternative icons: simpler symbolic icons, text instead of icons
(done by toolkit)
- zoom (done by compositing usually for high amounts of zoom,
but increasing font size for the document and increasing the
size of toolbars, menus and other widgets should be handled by
the application, to support HiDPI screens)
- screen readers
- voice fonts
- explicit mention of significant parentheses, submenus, links
and other visual clues when they are necessary to a correct
understanding of the interface.
- linearized or hierarchical interface
- dim less relevant parts of the screen (e.g. background windows) for ADD
- braille terminals
- keyboard-only navigation for everything except freehand drawing
- mouse-only navigation for everything except text input (the
system usually provides a virtual keyboard)
- EEG-only or eye-tracking-only or voice-input-only navigation
for paralyzed people and hands-free usage
- stdin/stdout or API so that the program may be controlled by
another program or via an SSH session.
- Frameworks / APIs
- WAI-ARIA on the web
- Orca for GNOME
- If the GUI toolkit is well designed (semantic description
instead of ad-hoc mutation of widget properties and non-semantic
drawing on canvases), then accessibility features should be easy
to implement.
- From game dev
- Level of Detail (LOD)
- Z-buffer: during compositing, to detect whether a pixel should be
updated or if it is hidden by another GUI element
- Object ID buffer or quadtree / kd-tree (to assign clicks)
- Raytracing
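A sketch of the object ID buffer idea from the list above: widgets are painted into an off-screen buffer that stores a widget identifier per pixel, and a click is assigned by reading one cell. For simplicity this assumes rectangular widgets given in back-to-front order (painter's algorithm) instead of a real Z-buffer; the function names are hypothetical.

```python
def paint_ids(width, height, widgets):
    """widgets: list of (widget_id, x, y, w, h) in back-to-front order.
    Returns a height x width grid holding the topmost widget id per pixel."""
    buffer = [[None] * width for _ in range(height)]
    for widget_id, x, y, w, h in widgets:
        for row in range(max(y, 0), min(y + h, height)):
            for col in range(max(x, 0), min(x + w, width)):
                buffer[row][col] = widget_id  # later widgets overwrite earlier ones
    return buffer


def widget_at(buffer, x, y):
    """Widget under the pointer, or None outside every widget."""
    if 0 <= y < len(buffer) and 0 <= x < len(buffer[0]):
        return buffer[y][x]
    return None


widgets = [("window", 0, 0, 10, 10), ("button", 2, 2, 3, 3)]
ids = paint_ids(12, 12, widgets)
print(widget_at(ids, 3, 3))    # button
print(widget_at(ids, 7, 7))    # window
print(widget_at(ids, 11, 11))  # None
```

The lookup is constant-time, at the cost of repainting the buffer whenever the layout changes; a quadtree or kd-tree trades that for a logarithmic query.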
- List of common shortcuts
- Not all are wise choices, and some are overloaded, but differences
from the norm should have a reason
- Ctrl+PgUp previous (left) tab (onglet)
- Ctrl+PgDown next (right) tab (onglet)
- Tab or Ctrl+Space autocomplete (various choices for whether to
apply autocomplete on Enter, Tab, Right or other)
- Tab next field
- Shift+Tab previous field
- Shift+Arrow move cursor and add to selection at the same time
(usually resets the selection if the previous command was not a
Shift+Motion key)
- Ctrl+Left, Ctrl+Right move cursor by word
- Ctrl+Up, Ctrl+Down move cursor by paragraph
- Shift+Ctrl+Arrow same as Ctrl+Arrow but also add to selection
everything between the previous and new cursor positions. Usually
resets the selection if the previous command was not a
Shift+Motion key
- Shift+Click add to selection (all elements between last one
selected and the one just clicked).
- If already selected, usually remove.
- If the range contains a mix of already selected and unselected
elements, unsure (probably add all, could be toggle)
- Ctrl+Click add to selection (just this element) / remove if already added
- Shift+Click+Drag: add to / remove from selection (may also add all
the elements between the last one selected and the one clicked)
- Ctrl+Click+Drag: add to / remove from selection
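One possible model of the Click / Ctrl+Click / Shift+Click conventions listed above, for elements identified by an integer index. The text leaves the mixed-range case open; this sketch assumes "add all" and always extends from the last element clicked, which is only one of the plausible choices.

```python
class Selection:
    def __init__(self):
        self.selected = set()
        self.anchor = None  # last element clicked with Click or Ctrl+Click

    def click(self, index):
        """Plain click: replace the selection with this element."""
        self.selected = {index}
        self.anchor = index

    def ctrl_click(self, index):
        """Toggle just this element."""
        self.selected ^= {index}
        self.anchor = index

    def shift_click(self, index):
        """Add every element between the anchor and the clicked element."""
        if self.anchor is None:
            self.click(index)
            return
        low, high = sorted((self.anchor, index))
        self.selected |= set(range(low, high + 1))


s = Selection()
s.click(2)
s.shift_click(5)
s.ctrl_click(3)
print(sorted(s.selected))  # [2, 4, 5]
```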
- Click: set focus and position text cursor
- Click+Drag: select and position text cursor
- F1 Help
- F2 Rename
- Ctrl+C copy
- Ctrl+X cut
- Ctrl+V paste
- Ctrl+S save
- Ctrl+W close document (in the current tab (onglet) or window)
- Ctrl+F4 close document (in the current tab)
- Alt+F1 open start menu, handled by the window manager
- Alt+F2 execute (usually handled by the window manager)
- Alt+F4 close window (all documents in current window), handled by
the window manager
- F5 Run program (IDEs) or refresh page (browsers)
- Ctrl+A select all
- Ctrl+Z undo
- Ctrl+N open new file (possibly in a tab (onglet))
- Ctrl+T open new tab (onglet)
- Ctrl+Tab, Ctrl+Shift+Tab next/previous (sub-)window within the
application or next/previous tab (onglet), sometimes handled by
the window manager to cycle between windows, can use a stack
(Alt+Tab, release, Alt+Tab is a no-op) or ring model (usually
annoying to use)
- Alt+Tab, handled by the window manager, can use a stack (better)
or ring (annoying) model
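The difference between the stack and ring models can be sketched as follows. Both classes are illustrative assumptions (no window manager API is involved), and both assume at least one window; `presses` is the number of Tab presses while Alt is held.

```python
class StackSwitcher:
    """Most-recently-used order: the chosen window moves to the front."""

    def __init__(self, windows):
        self.order = list(windows)  # most recently used first

    def alt_tab(self, presses=1):
        window = self.order.pop(presses % len(self.order))
        self.order.insert(0, window)
        return window


class RingSwitcher:
    """Fixed circular order: each Alt+Tab advances to the next window."""

    def __init__(self, windows):
        self.windows = list(windows)
        self.current = 0

    def alt_tab(self, presses=1):
        self.current = (self.current + presses) % len(self.windows)
        return self.windows[self.current]


names = ["editor", "browser", "terminal"]
stack, ring = StackSwitcher(names), RingSwitcher(names)
print(stack.alt_tab(), stack.alt_tab())  # browser editor
print(ring.alt_tab(), ring.alt_tab())    # browser terminal
```

With the stack model, Alt+Tab, release, Alt+Tab returns to the starting window, which makes toggling between two windows cheap; with the ring model the same gesture moves on to a third window.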
- Ctrl+I Ctrl+U Ctrl+B Italics, Underline, Bold
- Alt+left-click+drag on X.org move window, often handled by
window manager
- Alt+middle-click+drag or Alt+right-click+drag on X.org resize
window, often handled by window manager, the quadrant or octant
clicked indicates which corner or edge of the window should be
moved
- Enter validate
- Escape cancel, discard input and close, for dialog boxes and
prompts
- RightClick context menu
- Long press on touch screen: context menu or select or move
- List of common GUI features, widgets and conventions
- Tool tips
- Status bar
- Drag and drop
- Standard widgets with a well-understood meaning: buttons,
checkboxes, radio buttons, toggle buttons (usually poor UI),
separators etc.
- Scroll:
- scroll bar
- smooth scroll
- infinite scroll
- zoom on scroll bar (e.g. for video editing or playing, to select
a precise second in a 2-hour progress bar)
- Toolbars, menu bars, "hamburger" menus
- _Access key (in a menu entry or for a button: a key which will
trigger an entry is indicated by underlining that character
somewhere in the entry's name)
- Shortcuts on the right of menus
- Blue background indicates the currently focused selection.
- Other selections are grey.
- A black text cursor indicates the currently focused cursor.
- Unfocused text cursors may be grey or invisible.
- Dotted line on the edge of a button or hyperlink indicates it is focused.
- Focused tab is highlighted (no line separating it from the contents, blue background, …)
- bold label indicates default button (often OK)
- Feedback for actions:
- when clicked a button gets depressed
- when clicked a link changes color
- action in progress indicated by mouse cursor change, progress
bar or spinner
- Integration with the rest of the system
- File browser integration, e.g. TortoiseSVN and TortoiseGit
- Progress bar in application icon (shown in task bar and Alt+Tab)
- Updatable application icon (shown in task bar and Alt+Tab)
- Notifications on application icon, e.g. the number of unread
messages in a colored bubble, or an exclamation mark (shown in
task bar and Alt+Tab)
- Notification bubbles with buttons
- Updatable systray icon with menu, arbitrary systray widget
- Detect loss of focus on the window, detect window minimization
- Detect window resize
- Custom title bar (Firefox and Chromium merge tabs into the title bar)
- Borderless, fullscreen
- Clipboard: two clipboards in X.org; the usual Ctrl+C Ctrl+V
clipboard may contain text, files, bitmap pictures and other
application-specific data. Often this data can be converted to a
string as a fallback.
- GUI features that are useful or common on a given platform
- TODO: add reference to ergonomics survey
- TODO: capabilities based on the connection (e.g. remote users cannot
shut down the machine)
In this folder
Layout engine
Zoomable user interface