syslog-agent.sources = syslog
syslog-agent.channels = memoryChannel-1
syslog-agent.sinks = logger
syslog-agent.sources.syslog.type = syslogudp
syslog-agent.sources.syslog.port = 5140
syslog-agent.sources.syslog.host = localhost
FLUME SINKS
HDFS Sink
This sink writes events to the Hadoop Distributed File System (HDFS). It currently
supports creating text and sequence files, with compression available for both file types. Files can be
rolled (close the current file and create a new one) periodically based on elapsed time, size of data, or
number of events. It can also bucket/partition data by attributes such as the timestamp or the machine where the
event originated. The HDFS directory path may contain formatting escape sequences that are replaced
by the HDFS sink to generate the directory/file name used to store the events.
The following escape sequences are supported:

%{host}  host name stored in the event header
%t       Unix time in milliseconds
%a       locale's short weekday name (Mon, Tue, ...)
%A       locale's full weekday name (Monday, Tuesday, ...)
%b       locale's short month name (Jan, Feb, ...)
%B       locale's long month name (January, February, ...)
%c       locale's date and time (Thu Mar 3 23:05:25 2005)
%d       day of month (01)
%D       date; same as %m/%d/%y
%H       hour (00..23)
%I       hour (01..12)
%j       day of year (001..366)
%k       hour (0..23)
%m       month (01..12)
%M       minute (00..59)
%P       locale's equivalent of am or pm
%s       seconds since 1970-01-01 00:00:00 UTC
%S       second (00..59)
%y       last two digits of year (00..99)
%Y       year (2010)
%z       +hhmm numeric timezone (for example, -0400)
The name of the file in use is mangled to include a .tmp extension at the end. Once the file is closed,
this extension is removed, which allows partially complete files in the directory to be excluded from processing.
Name                    Default       Description
type                    -             The component type name, needs to be hdfs
hdfs.path               -             HDFS directory path (e.g. hdfs://namenode/flume/webdata/)
hdfs.filePrefix         FlumeData     Name prefixed to files created by Flume in the HDFS directory
hdfs.rollInterval       30            Number of seconds to wait before rolling the current file
hdfs.rollSize           1024          File size to trigger a roll (in bytes)
hdfs.rollCount          10            Number of events written to the file before it is rolled
hdfs.batchSize          1             Number of events written to the file before it is flushed to HDFS
hdfs.txnEventMax        100           Max number of events per transaction
hdfs.codeC              -             Compression codec, one of: gzip, bzip2, lzo, snappy
hdfs.fileType           SequenceFile  File format - currently SequenceFile or DataStream
hdfs.maxOpenFiles       5000          Max number of files kept open at once
hdfs.writeFormat        -             Text or Writable
hdfs.appendTimeout      1000          (in millis)
hdfs.callTimeout        5000          Timeout (in millis) for HDFS operations (open, write, flush, close)
hdfs.threadsPoolSize    10            Number of threads per HDFS sink for HDFS IO operations
hdfs.kerberosPrincipal  -             Kerberos user principal for accessing secure HDFS
hdfs.kerberosKeytab     -             Kerberos keytab for accessing secure HDFS
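As an illustrative sketch (the agent/component names and namenode URI are made up, and the syslog source/channel names follow the example at the top of this page), an HDFS sink that buckets events by day and hour using the escape sequences above might be configured like this:

```properties
syslog-agent.sinks = hdfs-sink
syslog-agent.sinks.hdfs-sink.type = hdfs
syslog-agent.sinks.hdfs-sink.channel = memoryChannel-1
# Bucket events by day and hour using the escape sequences above
syslog-agent.sinks.hdfs-sink.hdfs.path = hdfs://namenode/flume/webdata/%Y-%m-%d/%H
syslog-agent.sinks.hdfs-sink.hdfs.filePrefix = syslog
# Roll every 10 minutes or at 64 MB, whichever comes first;
# a roll trigger set to 0 disables that trigger
syslog-agent.sinks.hdfs-sink.hdfs.rollInterval = 600
syslog-agent.sinks.hdfs-sink.hdfs.rollSize = 67108864
syslog-agent.sinks.hdfs-sink.hdfs.rollCount = 0
syslog-agent.sinks.hdfs-sink.hdfs.fileType = DataStream
syslog-agent.sinks.hdfs-sink.hdfs.writeFormat = Text
```

The DataStream/Text combination writes plain text files; the default SequenceFile format is generally preferred when the data is consumed by MapReduce jobs.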
Logger Sink
Logs events at the INFO level. Typically useful for testing/debugging purposes.
Other than the type, this sink has no configurable properties.
Property Name  Default  Description
type           -        The component type name, needs to be logger
Avro
This sink forms one half of Flume's tiered collection support. Flume events sent to this sink are turned into Avro events and sent to the configured hostname / port pair. The events are taken from the configured Channel in batches of the configured batch size.
Property Name  Default  Description
type           -        The component type name, needs to be avro
hostname       -        The hostname or IP address to connect to
port           -        The port number to connect to
batch-size     100      Number of events to batch together per send
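As a sketch of the tiered-collection setup described above (the collector hostname and port are hypothetical), an Avro sink forwarding to a downstream collector agent might look like:

```properties
syslog-agent.sinks = avro-forward
syslog-agent.sinks.avro-forward.type = avro
syslog-agent.sinks.avro-forward.channel = memoryChannel-1
# Hostname/port of the downstream collector tier (illustrative)
syslog-agent.sinks.avro-forward.hostname = collector01.example.com
syslog-agent.sinks.avro-forward.port = 4545
syslog-agent.sinks.avro-forward.batch-size = 100
```

The collector agent would run an Avro source listening on the same port to complete the tier.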
IRC
The IRC sink takes messages from the attached channel and relays them to configured IRC
destinations.
Property Name  Default  Description
type           -        The component type name, needs to be irc
hostname       -        The hostname or IP address to connect to
port           6667     The port number of the remote host to connect to
nick           -        Nick name
user           -        User name
password       -        User password
chan           -        Channel name
splitlines     -        (boolean) Whether to split messages on line separators
splitchars     \n       Line separator (if you were to enter the default value into the config file, you would need to escape the backslash, like this: \\n)
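As an illustrative fragment (the IRC server, nick, and channel names are made up), an IRC sink relaying events to an ops channel might be configured like this:

```properties
syslog-agent.sinks = irc-alert
syslog-agent.sinks.irc-alert.type = irc
syslog-agent.sinks.irc-alert.channel = memoryChannel-1
# IRC server and identity (illustrative values)
syslog-agent.sinks.irc-alert.hostname = irc.example.org
syslog-agent.sinks.irc-alert.port = 6667
syslog-agent.sinks.irc-alert.nick = flume-bot
syslog-agent.sinks.irc-alert.user = flume
syslog-agent.sinks.irc-alert.chan = #ops-alerts
```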
FLUME CHANNELS
Channels are the repositories where events are staged on an agent. Sources add events and sinks remove them.
Memory Channel
Events are stored in an in-memory queue with a configurable maximum size. It is ideal for flows that need high throughput and can afford to lose the staged data in the event of an agent failure.
Property Name        Default  Description
type                 -        The component type name, needs to be memory
capacity             100      The max number of events stored in the channel
transactionCapacity  100      The max number of events stored in the channel per transaction
keep-alive           3        Timeout in seconds for adding or removing an event
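As a sketch (the capacity values are illustrative, not recommendations), a memory channel sized for a higher-throughput flow might look like:

```properties
syslog-agent.channels = memoryChannel-1
syslog-agent.channels.memoryChannel-1.type = memory
# Buffer up to 10000 events; each transaction moves at most 100
syslog-agent.channels.memoryChannel-1.capacity = 10000
syslog-agent.channels.memoryChannel-1.transactionCapacity = 100
syslog-agent.channels.memoryChannel-1.keep-alive = 3
```

Note that capacity bounds the total number of staged events, all of which are lost if the agent dies.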
JDBC Channel
Events are stored in persistent storage that is backed by a database. The JDBC channel currently supports embedded Derby. This is a durable channel that is ideal for flows where recoverability is important.
Property Name               Default                               Description
type                        -                                     The component type name, needs to be jdbc
db.type                     DERBY                                 Database vendor, needs to be DERBY
driver.class                org.apache.derby.jdbc.EmbeddedDriver  Class for the vendor's JDBC driver
driver.url                  (constructed from other properties)   JDBC connection URL
db.username                 sa                                    User id for the db connection
db.password                 -                                     Password for the db connection
connection.properties.file  -                                     JDBC connection property file path
create.schema               true                                  If true, creates the db schema if not present
create.index                true                                  Create indexes to speed up lookups
create.foreignkey           true                                  Create foreign key constraints
transaction.isolation       READ_COMMITTED                        Isolation level for the db session: READ_UNCOMMITTED, READ_COMMITTED, SERIALIZABLE, REPEATABLE_READ
maximum.connections         10                                    Max connections allowed to the db
maximum.capacity            0 (unlimited)                         Max number of events in the channel
sysprop.*                   -                                     DB vendor specific properties
sysprop.user.home           -                                     Home path to store the embedded Derby database
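Since most properties have sensible defaults for embedded Derby, a minimal durable-channel sketch (the data path and capacity are illustrative) can be quite short:

```properties
syslog-agent.channels = jdbcChannel-1
syslog-agent.channels.jdbcChannel-1.type = jdbc
# Store the embedded Derby database under an explicit path (illustrative)
syslog-agent.channels.jdbcChannel-1.sysprop.user.home = /var/lib/flume
# Cap staged events rather than using the unlimited default
syslog-agent.channels.jdbcChannel-1.maximum.capacity = 100000
```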
Recoverable Memory Channel
Property Name           Default                                         Description
type                    -                                               The component type name, needs to be org.apache.flume.channel.recoverable.memory.RecoverableMemoryChannel
wal.dataDir             ${user.home}/.flume/recoverable-memory-channel  Directory where the write-ahead log is stored
wal.rollSize            0x04000000                                      Max size (in bytes) of a single file before we roll
wal.minRetentionPeriod  300000                                          Min amount of time (in millis) to keep a log
wal.workerInterval      60000                                           How often (in millis) the background worker checks for old logs
wal.maxLogsSize         0x20000000                                      Total amount (in bytes) of logs to keep, excluding the current log
File Channel
NOTE: The File Channel is not yet ready for use. The options are being documented here in advance of its completion.
Property Name  Default  Description
type           -        The component type name, needs to be org.apache.flume.channel.file.FileChannel
Pseudo Transaction Channel
NOTE: The Pseudo Transaction Channel is mainly for testing purposes and is not meant for production use.
Property Name  Default  Description
type           -        The component type name, needs to be org.apache.flume.channel.PseudoTxnMemoryChannel
capacity       50       The max number of events stored in the channel
keep-alive     3        Timeout in seconds for adding or removing an event
Custom
A custom channel is your own implementation of the Channel interface. A custom channel's class and its dependencies must be included in the agent's classpath when starting the Flume agent. The type of the custom channel is its FQCN.
FLUME CHANNEL SELECTORS
Replicating Channel Selector (default)
Property Name  Default  Description
type           -        The component type name, needs to be replicating
Multiplexing Channel Selector
Property Name  Default                Description
type           -                      The component type name, needs to be multiplexing
header         flume.selector.header  Name of the event header whose value selects the channel(s)
default        -                      Channel(s) to use when the header value matches no mapping
mapping.*      -                      Channel(s) to use for a particular header value
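As a sketch (the "priority" header and channel names are hypothetical), a multiplexing selector routing events from the syslog source at the top of this page might look like:

```properties
# Route events by the value of a hypothetical "priority" header
syslog-agent.sources.syslog.selector.type = multiplexing
syslog-agent.sources.syslog.selector.header = priority
syslog-agent.sources.syslog.selector.mapping.ERROR = alertChannel
syslog-agent.sources.syslog.selector.mapping.INFO = bulkChannel
# Events whose header value matches no mapping go here
syslog-agent.sources.syslog.selector.default = bulkChannel
```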
Custom
A custom channel selector is your own implementation of the ChannelSelector interface. A custom channel selector's class and its dependencies must be included in the agent's classpath when starting the Flume agent. The type of the custom channel selector is its FQCN.
FLUME SINK PROCESSORS
Failover Sink Processor
Property Name        Default  Description
type                 -        The component type name, needs to be failover
maxpenalty           30000    Max penalty (in millis) applied to a failed sink before it is retried
priority.<sinkName>  -        Priority of the sink; <sinkName> must be one of the sink instances associated with the current sink group
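As an illustrative sketch (group and sink names are made up; the two sinks would be defined elsewhere in the agent configuration), a failover sink group might be wired up like this:

```properties
syslog-agent.sinkgroups = failover-group
syslog-agent.sinkgroups.failover-group.sinks = primary-sink backup-sink
syslog-agent.sinkgroups.failover-group.processor.type = failover
# The higher-priority sink is used while healthy; the backup takes over on failure
syslog-agent.sinkgroups.failover-group.processor.priority.primary-sink = 10
syslog-agent.sinkgroups.failover-group.processor.priority.backup-sink = 5
syslog-agent.sinkgroups.failover-group.processor.maxpenalty = 10000
```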
Default Sink Processor
Accepts only a single sink.
Property Name  Default  Description
type           -        The component type name, needs to be default