Age | Commit message (Collapse) | Author | Files | Lines |
|
Signed-off-by: Jamal Hadi Salim <[email protected]>
Acked-by: Cong Wang <[email protected]>
|
|
Useful to know when the action was first used for accounting
(and debugging)
Signed-off-by: Jamal Hadi Salim <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Signed-off-by: Nicolas Dichtel <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Currently tc actions are stored in a per-module hashtable,
therefore are visible to all network namespaces. This is
probably the last part of the tc subsystem which is not
aware of netns now. This patch makes them per-netns,
several tc action API's need to be adjusted for this.
The tc action API code is ugly due to historical reasons,
we need to refactor that code in the future.
Cc: Jamal Hadi Salim <[email protected]>
Signed-off-by: Cong Wang <[email protected]>
Acked-by: Jamal Hadi Salim <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
We only release the memory of the hashtable itself, not its
entries inside. This is not a problem yet since we only call
it in module release path, and module is refcount'ed by
actions. This would be a problem after we move the per module
hinfo into per netns in the latter patch.
Cc: Jamal Hadi Salim <[email protected]>
Signed-off-by: Cong Wang <[email protected]>
Acked-by: Jamal Hadi Salim <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
tcf_hash_destroy() used once. Make it static.
Signed-off-by: Alexei Starovoitov <[email protected]>
Acked-by: Daniel Borkmann <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Conflicts:
arch/s390/net/bpf_jit_comp.c
drivers/net/ethernet/ti/netcp_ethss.c
net/bridge/br_multicast.c
net/ipv4/ip_fragment.c
All four conflicts were cases of simple overlapping
changes.
Signed-off-by: David S. Miller <[email protected]>
|
|
Since commit 55334a5db5cd ("net_sched: act: refuse to remove bound action
outside"), we end up with a wrong reference count for a tc action.
Test case 1:
FOO="1,6 0 0 4294967295,"
BAR="1,6 0 0 4294967294,"
tc filter add dev foo parent 1: bpf bytecode "$FOO" flowid 1:1 \
action bpf bytecode "$FOO"
tc actions show action bpf
action order 0: bpf bytecode '1,6 0 0 4294967295' default-action pipe
index 1 ref 1 bind 1
tc actions replace action bpf bytecode "$BAR" index 1
tc actions show action bpf
action order 0: bpf bytecode '1,6 0 0 4294967294' default-action pipe
index 1 ref 2 bind 1
tc actions replace action bpf bytecode "$FOO" index 1
tc actions show action bpf
action order 0: bpf bytecode '1,6 0 0 4294967295' default-action pipe
index 1 ref 3 bind 1
Test case 2:
FOO="1,6 0 0 4294967295,"
tc filter add dev foo parent 1: bpf bytecode "$FOO" flowid 1:1 action ok
tc actions show action gact
action order 0: gact action pass
random type none pass val 0
index 1 ref 1 bind 1
tc actions add action drop index 1
RTNETLINK answers: File exists [...]
tc actions show action gact
action order 0: gact action pass
random type none pass val 0
index 1 ref 2 bind 1
tc actions add action drop index 1
RTNETLINK answers: File exists [...]
tc actions show action gact
action order 0: gact action pass
random type none pass val 0
index 1 ref 3 bind 1
What happens is that in tcf_hash_check(), we check tcf_common for a given
index and increase tcfc_refcnt and conditionally tcfc_bindcnt when we've
found an existing action. Now there are the following cases:
1) We do a late binding of an action. In that case, we leave the
tcfc_refcnt/tcfc_bindcnt increased and are done with the ->init()
handler. This is correctly handeled.
2) We replace the given action, or we try to add one without replacing
and find out that the action at a specific index already exists
(thus, we go out with error in that case).
In case of 2), we have to undo the reference count increase from
tcf_hash_check() in the tcf_hash_check() function. Currently, we fail to
do so because of the 'tcfc_bindcnt > 0' check which bails out early with
an -EPERM error.
Now, while commit 55334a5db5cd prevents 'tc actions del action ...' on an
already classifier-bound action to drop the reference count (which could
then become negative, wrap around etc), this restriction only accounts for
invocations outside a specific action's ->init() handler.
One possible solution would be to add a flag thus we possibly trigger
the -EPERM ony in situations where it is indeed relevant.
After the patch, above test cases have correct reference count again.
Fixes: 55334a5db5cd ("net_sched: act: refuse to remove bound action outside")
Signed-off-by: Daniel Borkmann <[email protected]>
Reviewed-by: Cong Wang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Reuse existing percpu infrastructure John Fastabend added for qdisc.
This patch adds a new cpustats parameter to tcf_hash_create() and all
actions pass false, meaning this patch should have no effect yet.
Signed-off-by: Eric Dumazet <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Acked-by: Jamal Hadi Salim <[email protected]>
Acked-by: John Fastabend <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Not used.
pedit sets TC_MUNGED when packet content was altered, but all the core
does is unset MUNGED again and then set OK2MUNGE.
And the latter isn't tested anywhere. So lets remove both
TC_MUNGED and TC_OK2MUNGE.
Signed-off-by: Florian Westphal <[email protected]>
Acked-by: Alexei Starovoitov <[email protected]>
Acked-by: Daniel Borkmann <[email protected]>
Acked-by: Jamal Hadi Salim <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
After previous patches to simplify qstats the qstats can be
made per cpu with a packed union in Qdisc struct.
Signed-off-by: John Fastabend <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
This removes the use of qstats->qlen variable from the classifiers
and makes it an explicit argument to gnet_stats_copy_queue().
The qlen represents the qdisc queue length and is packed into
the qstats at the last moment before passnig to user space. By
handling it explicitely we avoid, in the percpu stats case, having
to figure out which per_cpu variable to put it in.
It would probably be best to remove it from qstats completely
but qstats is a user space ABI and can't be broken. A future
patch could make an internal only qstats structure that would
avoid having to allocate an additional u32 variable on the
Qdisc struct. This would make the qstats struct 128bits instead
of 128+32.
Signed-off-by: John Fastabend <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
In order to run qdisc's without locking statistics and estimators
need to be handled correctly.
To resolve bstats make the statistics per cpu. And because this is
only needed for qdiscs that are running without locks which is not
the case for most qdiscs in the near future only create percpu
stats when qdiscs set the TCQ_F_CPUSTATS flag.
Next because estimators use the bstats to calculate packets per
second and bytes per second the estimator code paths are updated
to use the per cpu statistics.
Signed-off-by: John Fastabend <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
It is possible by passing a netlink socket to a more privileged
executable and then to fool that executable into writing to the socket
data that happens to be valid netlink message to do something that
privileged executable did not intend to do.
To keep this from happening replace bare capable and ns_capable calls
with netlink_capable, netlink_net_calls and netlink_ns_capable calls.
Which act the same as the previous calls except they verify that the
opener of the socket had the desired permissions as well.
Reported-by: Andy Lutomirski <[email protected]>
Signed-off-by: "Eric W. Biederman" <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
We could allocate tc_action on stack in tca_action_flush(),
since it is not large.
Also, we could use create_a() in tcf_action_get_1().
Cc: Jamal Hadi Salim <[email protected]>
Cc: David S. Miller <[email protected]>
Signed-off-by: Cong Wang <[email protected]>
Signed-off-by: Jamal Hadi Salim <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
When an action is bonnd to a filter, there is no point to
remove it outside. Currently we just silently decrease the refcnt,
we should reject this explicitly with EPERM.
Cc: Jamal Hadi Salim <[email protected]>
Cc: David S. Miller <[email protected]>
Signed-off-by: Cong Wang <[email protected]>
Signed-off-by: Jamal Hadi Salim <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Cc: Jamal Hadi Salim <[email protected]>
Cc: David S. Miller <[email protected]>
Signed-off-by: Cong Wang <[email protected]>
Signed-off-by: Jamal Hadi Salim <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
For bindcnt and refcnt etc., they are common for all actions,
not need to repeat such operations for their own, they can be unified
now. Actions just need to do its specific cleanup if needed.
Cc: Jamal Hadi Salim <[email protected]>
Cc: David S. Miller <[email protected]>
Signed-off-by: Cong Wang <[email protected]>
Signed-off-by: Jamal Hadi Salim <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Now we can totally hide it from modules. tcf_hash_*() API's
will operate on struct tc_action, modules don't need to care about
the details.
Cc: Jamal Hadi Salim <[email protected]>
Cc: David S. Miller <[email protected]>
Signed-off-by: Cong Wang <[email protected]>
Signed-off-by: Jamal Hadi Salim <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
So that we will not expose struct tcf_common to modules.
Cc: Jamal Hadi Salim <[email protected]>
Cc: David S. Miller <[email protected]>
Signed-off-by: Cong Wang <[email protected]>
Signed-off-by: Jamal Hadi Salim <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Every action ops has a pointer to hash info, so we don't need to
hard-code it in each module.
Cc: Jamal Hadi Salim <[email protected]>
Cc: David S. Miller <[email protected]>
Signed-off-by: Cong Wang <[email protected]>
Signed-off-by: Jamal Hadi Salim <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
It is not necessary at all.
Cc: Jamal Hadi Salim <[email protected]>
Cc: David S. Miller <[email protected]>
Signed-off-by: Cong Wang <[email protected]>
Signed-off-by: Jamal Hadi Salim <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Refactor tcf_add_notify() and factor out tcf_del_notify().
Cc: Jamal Hadi Salim <[email protected]>
Cc: David S. Miller <[email protected]>
Signed-off-by: Cong Wang <[email protected]>
Signed-off-by: Jamal Hadi Salim <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
There is no need to store the index separatedly
since tcf_hashinfo is allocated statically too.
Cc: Jamal Hadi Salim <[email protected]>
Cc: David S. Miller <[email protected]>
Signed-off-by: Cong Wang <[email protected]>
Signed-off-by: Jamal Hadi Salim <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
action flushing missaccounting
Account only for deleted actions
Signed-off-by: Jamal Hadi Salim <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Remove unnecessary checks for act->ops
(suggested by Eric Dumazet).
Signed-off-by: Jamal Hadi Salim <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
No need to export functions only used in one file.
Signed-off-by: Stephen Hemminger <[email protected]>
Acked-by: Jamal Hadi Salim <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
list_for_each_entry(a, &act_base, head) doesn't
exit with a = NULL if we reached the end of the list.
tcf_unregister_action(), tc_lookup_action_n() and tc_lookup_action()
need fixes.
Remove tc_lookup_action_id() as its unused and not worth 'fixing'
Signed-off-by: Eric Dumazet <[email protected]>
Fixes: 1f747c26c48b ("net_sched: convert tc_action_ops to use struct list_head")
Cc: Cong Wang <[email protected]>
Reviewed-by: Cong Wang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
We don't need to maintain our own singly linked list code.
Cc: Jamal Hadi Salim <[email protected]>
Cc: David S. Miller <[email protected]>
Signed-off-by: Cong Wang <[email protected]>
Signed-off-by: Jamal Hadi Salim <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
So that we don't need to play with singly linked list,
and since the code is not on hot path, we can use spinlock
instead of rwlock.
Cc: Jamal Hadi Salim <[email protected]>
Cc: David S. Miller <[email protected]>
Signed-off-by: Cong Wang <[email protected]>
Signed-off-by: Jamal Hadi Salim <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
It looks weird to store the lock out of the struct but
still points to a static variable. Just move them into the struct.
Cc: Jamal Hadi Salim <[email protected]>
Cc: David S. Miller <[email protected]>
Signed-off-by: Cong Wang <[email protected]>
Signed-off-by: Jamal Hadi Salim <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Currently actions are chained by a singly linked list,
therefore it is a bit hard to add and remove a specific
entry. Convert it to struct list_head so that in the
latter patch we can remove an action without finding
its head.
Cc: Jamal Hadi Salim <[email protected]>
Cc: David S. Miller <[email protected]>
Signed-off-by: Cong Wang <[email protected]>
Signed-off-by: Jamal Hadi Salim <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
It is not used.
Cc: Jamal Hadi Salim <[email protected]>
Cc: David S. Miller <[email protected]>
Signed-off-by: Cong Wang <[email protected]>
Signed-off-by: Jamal Hadi Salim <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
return is not a function, parentheses are not required.
Signed-off-by: Yang Yingliang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Signed-off-by: Jamal Hadi Salim <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Signed-off-by: Jamal Hadi Salim <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Signed-off-by: Jamal Hadi Salim <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
With decnet converted, we can finally get rid of rta_buf and its
computations around it. It also gets rid of the minimal header
length verification since all message handlers do that explicitly
anyway.
Signed-off-by: Thomas Graf <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Eric Dumazet pointed out that act_mirred needs to find the current net_ns,
and struct net pointer is not provided in the call chain. His original
patch made use of current->nsproxy->net_ns to find the network namespace,
but this fails to work correctly for userspace code that makes use of
netlink sockets in different network namespaces. Instead, pass the
"struct net *" down along the call chain to where it is needed.
This version removes the ifb changes as Eric has submitted that patch
separately, but is otherwise identical to the previous version.
Signed-off-by: Benjamin LaHaise <[email protected]>
Tested-by: Eric Dumazet <[email protected]>
Acked-by: Jamal Hadi Salim <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
- In rtnetlink_rcv_msg convert the capable(CAP_NET_ADMIN) check
to ns_capable(net->user-ns, CAP_NET_ADMIN). Allowing unprivileged
users to make netlink calls to modify their local network
namespace.
- In the rtnetlink doit methods add capable(CAP_NET_ADMIN) so
that calls that are not safe for unprivileged users are still
protected.
Later patches will remove the extra capable calls from methods
that are safe for unprivilged users.
Acked-by: Serge Hallyn <[email protected]>
Signed-off-by: "Eric W. Biederman" <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
It is a frequent mistake to confuse the netlink port identifier with a
process identifier. Try to reduce this confusion by renaming fields
that hold port identifiers portid instead of pid.
I have carefully avoided changing the structures exported to
userspace to avoid changing the userspace API.
I have successfully built an allyesconfig kernel with this change.
Signed-off-by: "Eric W. Biederman" <[email protected]>
Acked-by: Stephen Hemminger <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Move away from NLMSG_NEW() as well.
And use nlmsg_data() while we're here too.
Signed-off-by: David S. Miller <[email protected]>
|
|
These macros contain a hidden goto, and are thus extremely error
prone and make code hard to audit.
Signed-off-by: David S. Miller <[email protected]>
|
|
With calls to modular infrastructure, these files really
needs the full module.h header. Call it out so some of the
cleanups of implicit and unrequired includes elsewhere can be
cleaned up.
Signed-off-by: Paul Gortmaker <[email protected]>
|
|
Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The message size allocated for rtnl ifinfo dumps was limited to
a single page. This is not enough for additional interface info
available with devices that support SR-IOV and caused a bug in
which VF info would not be displayed if more than approximately
40 VFs were created per interface.
Implement a new function pointer for the rtnl_register service that will
calculate the amount of data required for the ifinfo dump and allocate
enough data to satisfy the request.
Signed-off-by: Greg Rose <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>
|
|
The rcu callback tcf_common_free_rcu() just calls a kfree(),
so we use kfree_rcu() instead of the call_rcu(tcf_common_free_rcu).
Signed-off-by: Lai Jiangshan <[email protected]>
Acked-by: David S. Miller <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
Reviewed-by: Josh Triplett <[email protected]>
|
|
Fixes generated by 'codespell' and manually reviewed.
Signed-off-by: Lucas De Marchi <[email protected]>
|
|
Cleanup net/sched code to current CodingStyle and practices.
Reduce inline abuse
Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
gen_kill_estimator() API is incomplete or not well documented, since
caller should make sure an RCU grace period is respected before
freeing stats_lock.
This was partially addressed in commit 5d944c640b4
(gen_estimator: deadlock fix), but same problem exist for all
gen_kill_estimator() users, if lock they use is not already RCU
protected.
A code review shows xt_RATEEST.c, act_api.c, act_police.c have this
problem. Other are ok because they use qdisc lock, already RCU
protected.
Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|